Preface for the Instructor


You are about to teach a course that will probably give students their second 
exposure to linear algebra. During their first brush with the subject, your 
students probably worked with Euclidean spaces and matrices. In contrast, 
this course will emphasize abstract vector spaces and linear maps. 

The audacious title of this book deserves an explanation. Almost all 
linear algebra books use determinants to prove that every linear operator on 
a finite-dimensional complex vector space has an eigenvalue. Determinants 
are difficult, nonintuitive, and often defined without motivation. To prove the 
theorem about existence of eigenvalues on complex vector spaces, most books 
must define determinants, prove that a linear map is not invertible if and only 
if its determinant equals 0, and then define the characteristic polynomial. This 
tortuous (torturous?) path gives students little feeling for why eigenvalues 
exist. 

In contrast, the simple determinant-free proofs presented here (for example, 
see 5.21) offer more insight. Once determinants have been banished to the 
end of the book, a new route opens to the main goal of linear algebra— 
understanding the structure of linear operators. 

This book starts at the beginning of the subject, with no prerequisites 
other than the usual demand for suitable mathematical maturity. Even if your 
students have already seen some of the material in the first few chapters, they 
may be unaccustomed to working exercises of the type presented here, most 
of which require an understanding of proofs. 

Here is a chapter-by-chapter summary of the highlights of the book: 


• Chapter 1: Vector spaces are defined in this chapter, and their basic
properties are developed.


• Chapter 2: Linear independence, span, basis, and dimension are defined in
this chapter, which presents the basic theory of finite-dimensional vector
spaces.


• Chapter 3: Linear maps are introduced in this chapter. The key result here
is the Fundamental Theorem of Linear Maps (3.22): if T is a linear map
on V, then dim V = dim null T + dim range T. Quotient spaces and duality
are topics in this chapter at a higher level of abstraction than other parts 
of the book; these topics can be skipped without running into problems 
elsewhere in the book. 


• Chapter 4: The part of the theory of polynomials that will be needed
to understand linear operators is presented in this chapter. This chapter 
contains no linear algebra. It can be covered quickly, especially if your 
students are already familiar with these results. 


• Chapter 5: The idea of studying a linear operator by restricting it to small
subspaces leads to eigenvectors in the early part of this chapter. The 
highlight of this chapter is a simple proof that on complex vector spaces, 
eigenvalues always exist. This result is then used to show that each linear 
operator on a complex vector space has an upper-triangular matrix with 
respect to some basis. All this is done without defining determinants or 
characteristic polynomials! 


• Chapter 6: Inner product spaces are defined in this chapter, and their basic
properties are developed along with standard tools such as orthonormal 
bases and the Gram-Schmidt Procedure. This chapter also shows how 
orthogonal projections can be used to solve certain minimization problems. 


• Chapter 7: The Spectral Theorem, which characterizes the linear operators
for which there exists an orthonormal basis consisting of eigenvectors, 
is the highlight of this chapter. The work in earlier chapters pays off 
here with especially simple proofs. This chapter also deals with positive 
operators, isometries, the Polar Decomposition, and the Singular Value 
Decomposition. 


• Chapter 8: Minimal polynomials, characteristic polynomials, and generalized
eigenvectors are introduced in this chapter. The main achievement
of this chapter is the description of a linear operator on a complex vector 
space in terms of its generalized eigenvectors. This description enables 
one to prove many of the results usually proved using Jordan Form. For 
example, these tools are used to prove that every invertible linear operator 
on a complex vector space has a square root. The chapter concludes with a 
proof that every linear operator on a complex vector space can be put into 
Jordan Form. 


• Chapter 9: Linear operators on real vector spaces occupy center stage in
this chapter. Here the main technique is complexification, which is a natural 
extension of an operator on a real vector space to an operator on a complex 
vector space. Complexification allows our results about complex vector 
spaces to be transferred easily to real vector spaces. For example, this 
technique is used to show that every linear operator on a real vector space 
has an invariant subspace of dimension 1 or 2. As another example, we
show that every linear operator on an odd-dimensional real vector space
has an eigenvalue.


• Chapter 10: The trace and determinant (on complex vector spaces) are
defined in this chapter as the sum of the eigenvalues and the product of the
eigenvalues, both counting multiplicity. These easy-to-remember definitions
would not be possible with the traditional approach to eigenvalues,
because the traditional method uses determinants to prove that sufficient 
eigenvalues exist. The standard theorems about determinants now become 
much clearer. The Polar Decomposition and the Real Spectral Theorem are 
used to derive the change of variables formula for multivariable integrals in 
a fashion that makes the appearance of the determinant there seem natural. 


This book usually develops linear algebra simultaneously for real and 
complex vector spaces by letting F denote either the real or the complex 
numbers. If you and your students prefer to think of F as an arbitrary field, 
then see the comments at the end of Section 1.A. I prefer avoiding arbitrary 
fields at this level because they introduce extra abstraction without leading 
to any new linear algebra. Also, students are more comfortable thinking 
of polynomials as functions instead of the more formal objects needed for 
polynomials with coefficients in finite fields. Finally, even if the beginning 
part of the theory were developed with arbitrary fields, inner product spaces 
would push consideration back to just real and complex vector spaces. 

You probably cannot cover everything in this book in one semester. Going 
through the first eight chapters is a good goal for a one-semester course. If 
you must reach Chapter 10, then consider covering Chapters 4 and 9 in fifteen 
minutes each, as well as skipping the material on quotient spaces and duality 
in Chapter 3. 

A goal more important than teaching any particular theorem is to develop in 
students the ability to understand and manipulate the objects of linear algebra. 
Mathematics can be learned only by doing. Fortunately, linear algebra has 
many good homework exercises. When teaching this course, during each 
class I usually assign as homework several of the exercises, due the next class. 
Going over the homework might take up a third or even half of a typical class. 
Major changes from the previous edition:


• This edition contains 561 exercises, including 337 new exercises that were
not in the previous edition. Exercises now appear at the end of each section, 
rather than at the end of each chapter. 


• Many new examples have been added to illustrate the key ideas of linear
algebra. 


• Beautiful new formatting, including the use of color, creates pages with an
unusually pleasant appearance in both print and electronic versions. As a 
visual aid, definitions are in beige boxes and theorems are in blue boxes (in 
color versions of the book). 


• Each theorem now has a descriptive name.


• New topics covered in the book include product spaces, quotient spaces,
and duality. 


• Chapter 9 (Operators on Real Vector Spaces) has been completely rewritten
to take advantage of simplifications via complexification. This approach 
allows for more streamlined presentations in Chapters 5 and 7 because 
those chapters now focus mostly on complex vector spaces. 


• Hundreds of improvements have been made throughout the book. For
example, the proof of Jordan Form (Section 8.D) has been simplified. 


Please check the website below for additional information about the book. I 
may occasionally write new sections on additional topics. These new sections 
will be posted on the website. Your suggestions, comments, and corrections 
are most welcome. 

Best wishes for teaching a successful linear algebra class! 


Sheldon Axler 

Mathematics Department 

San Francisco State University 
San Francisco, CA 94132, USA 


website: linear.axler.net
e-mail: linear@axler.net
Twitter: @AxlerLinear 


Preface for the Student 


You are probably about to begin your second exposure to linear algebra. Unlike 
your first brush with the subject, which probably emphasized Euclidean spaces 
and matrices, this encounter will focus on abstract vector spaces and linear 
maps. These terms will be defined later, so don’t worry if you do not know 
what they mean. This book starts from the beginning of the subject, assuming 
no knowledge of linear algebra. The key point is that you are about to 
immerse yourself in serious mathematics, with an emphasis on attaining a 
deep understanding of the definitions, theorems, and proofs. 

You cannot read mathematics the way you read a novel. If you zip through a 
page in less than an hour, you are probably going too fast. When you encounter 
the phrase “as you should verify”, you should indeed do the verification, which 
will usually require some writing on your part. When steps are left out, you 
need to supply the missing pieces. You should ponder and internalize each 
definition. For each theorem, you should seek examples to show why each 
hypothesis is necessary. Discussions with other students should help. 

As a visual aid, definitions are in beige boxes and theorems are in blue 
boxes (in color versions of the book). Each theorem has a descriptive name. 

Please check the website below for additional information about the book. I 
may occasionally write new sections on additional topics. These new sections 
will be posted on the website. Your suggestions, comments, and corrections 
are most welcome. 

Best wishes for success and enjoyment in learning linear algebra! 


Sheldon Axler 

Mathematics Department 

San Francisco State University 
San Francisco, CA 94132, USA 


website: linear.axler.net
e-mail: linear@axler.net 
Twitter: @AxlerLinear 
CHAPTER 1


Vector Spaces


[Figure: René Descartes explaining his work to Queen Christina of Sweden.
Vector spaces are a generalization of the description of a plane using
two coordinates, as published by Descartes in 1637.]


Linear algebra is the study of linear maps on finite-dimensional vector spaces. 
Eventually we will learn what all these terms mean. In this chapter we will 
define vector spaces and discuss their elementary properties. 

In linear algebra, better theorems and more insight emerge if complex 
numbers are investigated along with real numbers. Thus we will begin by 
introducing the complex numbers and their basic properties. 

We will generalize the examples of a plane and ordinary space to $\mathbf{R}^n$
and $\mathbf{C}^n$, which we then will generalize to the notion of a vector space. The
elementary properties of a vector space will already seem familiar to you. 

Then our next topic will be subspaces, which play a role for vector spaces 
analogous to the role played by subsets for sets. Finally, we will look at sums 
of subspaces (analogous to unions of subsets) and direct sums of subspaces 
(analogous to unions of disjoint sets). 


LEARNING OBJECTIVES FOR THIS CHAPTER 


• basic properties of the complex numbers
• $\mathbf{R}^n$ and $\mathbf{C}^n$
• vector spaces
• subspaces
• sums and direct sums of subspaces


1.A $\mathbf{R}^n$ and $\mathbf{C}^n$


Complex Numbers 


You should already be familiar with basic properties of the set R of real 
numbers. Complex numbers were invented so that we can take square roots of 
negative numbers. The idea is to assume we have a square root of $-1$, denoted
i, that obeys the usual rules of arithmetic. Here are the formal definitions: 


1.1 Definition complex numbers 


• A complex number is an ordered pair $(a, b)$, where $a, b \in \mathbf{R}$, but
we will write this as $a + bi$.


• The set of all complex numbers is denoted by C:

\[ \mathbf{C} = \{a + bi : a, b \in \mathbf{R}\}. \]


• Addition and multiplication on C are defined by
\[ (a + bi) + (c + di) = (a + c) + (b + d)i, \]
\[ (a + bi)(c + di) = (ac - bd) + (ad + bc)i; \]
here $a, b, c, d \in \mathbf{R}$.


If $a \in \mathbf{R}$, we identify $a + 0i$ with the real number $a$. Thus we can think
of R as a subset of C. We also usually write $0 + bi$ as just $bi$, and we usually
write $0 + 1i$ as just $i$.


Using multiplication as defined above, you should verify that $i^2 = -1$.
Do not memorize the formula for the product of two complex numbers; you
can always rederive it by recalling that $i^2 = -1$ and then using the usual rules
of arithmetic (as given by 1.3).

Note: The symbol $i$ was first used to denote $\sqrt{-1}$ by Swiss mathematician
Leonhard Euler in 1777.


1.2 Example Evaluate $(2 + 3i)(4 + 5i)$.


Solution
\[
\begin{aligned}
(2 + 3i)(4 + 5i) &= 2 \cdot 4 + 2 \cdot (5i) + (3i) \cdot 4 + (3i)(5i) \\
&= 8 + 10i + 12i - 15 \\
&= -7 + 22i
\end{aligned}
\]
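
As a quick numerical check of the product formula in Definition 1.1, here is a minimal Python sketch. The helper name `multiply` is an illustrative choice, not notation from the text; the last line compares the result with Python's built-in complex type, where `1j` plays the role of $i$.

```python
def multiply(a, b, c, d):
    """Return (a + bi)(c + di) as the pair (real part, imaginary part)."""
    # Definition 1.1: (a + bi)(c + di) = (ac - bd) + (ad + bc)i
    return (a * c - b * d, a * d + b * c)

product = multiply(2, 3, 4, 5)
print(product)  # (-7, 22)

# Cross-check against the built-in complex arithmetic.
assert complex(*product) == (2 + 3j) * (4 + 5j)
```

The same helper applied to $i \cdot i$, that is `multiply(0, 1, 0, 1)`, returns `(-1, 0)`, which matches $i^2 = -1$.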
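1.3 Properties of complex arithmetic


commutativity
$\alpha + \beta = \beta + \alpha$ and $\alpha\beta = \beta\alpha$ for all $\alpha, \beta \in \mathbf{C}$;


associativity
$(\alpha + \beta) + \lambda = \alpha + (\beta + \lambda)$ and $(\alpha\beta)\lambda = \alpha(\beta\lambda)$ for all $\alpha, \beta, \lambda \in \mathbf{C}$;


identities
$\lambda + 0 = \lambda$ and $\lambda 1 = \lambda$ for all $\lambda \in \mathbf{C}$;


additive inverse
for every $\alpha \in \mathbf{C}$, there exists a unique $\beta \in \mathbf{C}$ such that $\alpha + \beta = 0$;


multiplicative inverse
for every $\alpha \in \mathbf{C}$ with $\alpha \ne 0$, there exists a unique $\beta \in \mathbf{C}$ such that
$\alpha\beta = 1$;


distributive property
$\lambda(\alpha + \beta) = \lambda\alpha + \lambda\beta$ for all $\lambda, \alpha, \beta \in \mathbf{C}$.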


The properties above are proved using the familiar properties of real 
numbers and the definitions of complex addition and multiplication. The 
next example shows how commutativity of complex multiplication is proved. 
Proofs of the other properties above are left as exercises. 


1.4 Example Show that $\alpha\beta = \beta\alpha$ for all $\alpha, \beta \in \mathbf{C}$.


Solution Suppose $\alpha = a + bi$ and $\beta = c + di$, where $a, b, c, d \in \mathbf{R}$. Then
the definition of multiplication of complex numbers shows that

\[
\begin{aligned}
\alpha\beta &= (a + bi)(c + di) \\
&= (ac - bd) + (ad + bc)i
\end{aligned}
\]

and

\[
\begin{aligned}
\beta\alpha &= (c + di)(a + bi) \\
&= (ca - db) + (cb + da)i.
\end{aligned}
\]

The equations above and the commutativity of multiplication and addition of
real numbers show that $\alpha\beta = \beta\alpha$.
1.5 Definition $-\alpha$, subtraction, $1/\alpha$, division
Let $\alpha, \beta \in \mathbf{C}$.


• Let $-\alpha$ denote the additive inverse of $\alpha$. Thus $-\alpha$ is the unique
complex number such that

\[ \alpha + (-\alpha) = 0. \]

• Subtraction on C is defined by

\[ \beta - \alpha = \beta + (-\alpha). \]

• For $\alpha \ne 0$, let $1/\alpha$ denote the multiplicative inverse of $\alpha$. Thus $1/\alpha$
is the unique complex number such that

\[ \alpha(1/\alpha) = 1. \]

• Division on C is defined by

\[ \beta/\alpha = \beta(1/\alpha). \]


So that we can conveniently make definitions and prove theorems that 
apply to both real and complex numbers, we adopt the following notation: 


1.6 Notation F 
Throughout this book, F stands for either R or C. 


Thus if we prove a theorem involving F, we will know that it holds when F is
replaced with R and when F is replaced with C.

Note: The letter F is used because R and C are examples of what are called
fields.


Elements of F are called scalars. The word “scalar”, a fancy word for 
“number”, is often used when we want to emphasize that an object is a number, 
as opposed to a vector (vectors will be defined soon). 

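For $\alpha \in \mathbf{F}$ and $m$ a positive integer, we define $\alpha^m$ to denote the product of
$\alpha$ with itself $m$ times:

\[ \alpha^m = \underbrace{\alpha \cdots \alpha}_{m \text{ times}}. \]

Clearly $(\alpha^m)^n = \alpha^{mn}$ and $(\alpha\beta)^m = \alpha^m \beta^m$ for all $\alpha, \beta \in \mathbf{F}$ and all positive
integers $m, n$.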
Lists


Before defining $\mathbf{R}^n$ and $\mathbf{C}^n$, we look at two important examples.


1.7 Example $\mathbf{R}^2$ and $\mathbf{R}^3$


• The set $\mathbf{R}^2$, which you can think of as a plane, is the set of all ordered
pairs of real numbers:

\[ \mathbf{R}^2 = \{(x, y) : x, y \in \mathbf{R}\}. \]

• The set $\mathbf{R}^3$, which you can think of as ordinary space, is the set of all
ordered triples of real numbers:

\[ \mathbf{R}^3 = \{(x, y, z) : x, y, z \in \mathbf{R}\}. \]


To generalize $\mathbf{R}^2$ and $\mathbf{R}^3$ to higher dimensions, we first need to discuss the
concept of lists.


1.8 Definition list, length 


Suppose n is a nonnegative integer. A list of length n is an ordered 
collection of n elements (which might be numbers, other lists, or more 
abstract entities) separated by commas and surrounded by parentheses. A 
list of length n looks like this: 


\[ (x_1, \dots, x_n). \]


Two lists are equal if and only if they have the same length and the same 
elements in the same order. 


Thus a list of length 2 is an ordered pair, and a list of length 3 is an ordered
triple.

Note: Many mathematicians call a list of length $n$ an $n$-tuple.

Sometimes we will use the word list without specifying its length.
Remember, however, that by definition each list has a finite length that is a
nonnegative integer. Thus an object that looks like


\[ (x_1, x_2, \dots), \]


which might be said to have infinite length, is not a list. 

A list of length 0 looks like this: ( ). We consider such an object to be a 
list so that some of our theorems will not have trivial exceptions. 

Lists differ from sets in two ways: in lists, order matters and repetitions 
have meaning; in sets, order and repetitions are irrelevant. 
1.9 Example lists versus sets


• The lists (3, 5) and (5, 3) are not equal, but the sets {3, 5} and {5, 3} are
equal.


• The lists (4, 4) and (4, 4, 4) are not equal (they do not have the same
length), although the sets {4, 4} and {4, 4, 4} both equal the set {4}.
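
The contrast in Example 1.9 can be mirrored in Python, under the assumption that a list in the sense of Definition 1.8 is modeled by a Python tuple and a set by a Python `set`. This is only an informal analogy; the variable names are illustrative.

```python
# Lists are modeled here as Python tuples: order and repetition both matter.
lists_differ_by_order = (3, 5) != (5, 3)
lists_differ_by_length = (4, 4) != (4, 4, 4)

# Sets ignore both order and repetition.
sets_ignore_order = {3, 5} == {5, 3}
sets_ignore_repetition = {4, 4} == {4, 4, 4} == {4}

# The list of length 0 corresponds to the empty tuple.
empty_list_length = len(())

print(lists_differ_by_order, lists_differ_by_length)  # True True
print(sets_ignore_order, sets_ignore_repetition)      # True True
print(empty_list_length)                              # 0
```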


$\mathbf{F}^n$

To define the higher-dimensional analogues of $\mathbf{R}^2$ and $\mathbf{R}^3$, we will simply
replace R with F (which equals R or C) and replace the 2 or 3 with an
arbitrary positive integer. Specifically, fix a positive integer $n$ for the rest of
this section.

1.10 Definition $\mathbf{F}^n$

$\mathbf{F}^n$ is the set of all lists of length $n$ of elements of F:

\[ \mathbf{F}^n = \{(x_1, \dots, x_n) : x_j \in \mathbf{F} \text{ for } j = 1, \dots, n\}. \]


For $(x_1, \dots, x_n) \in \mathbf{F}^n$ and $j \in \{1, \dots, n\}$, we say that $x_j$ is the $j^{\text{th}}$
coordinate of $(x_1, \dots, x_n)$.


If F = R and $n$ equals 2 or 3, then this definition of $\mathbf{F}^n$ agrees with our
previous notions of $\mathbf{R}^2$ and $\mathbf{R}^3$.


1.11 Example $\mathbf{C}^4$ is the set of all lists of four complex numbers:


\[ \mathbf{C}^4 = \{(z_1, z_2, z_3, z_4) : z_1, z_2, z_3, z_4 \in \mathbf{C}\}. \]


If $n \ge 4$, we cannot visualize $\mathbf{R}^n$ as a physical object. Similarly, $\mathbf{C}^1$ can
be thought of as a plane, but for $n \ge 2$, the human brain cannot provide a full
image of $\mathbf{C}^n$. However, even if $n$ is large, we can perform algebraic
manipulations in $\mathbf{F}^n$ as easily as in $\mathbf{R}^2$ or $\mathbf{R}^3$. For example, addition in $\mathbf{F}^n$ is
defined as follows:

Note: For an amusing account of how $\mathbf{R}^3$ would be perceived by creatures
living in $\mathbf{R}^2$, read Flatland: A Romance of Many Dimensions, by Edwin A.
Abbott. This novel, published in 1884, may help you imagine a physical space
of four or more dimensions.
1.12 Definition addition in $\mathbf{F}^n$


Addition in $\mathbf{F}^n$ is defined by adding corresponding coordinates:


\[ (x_1, \dots, x_n) + (y_1, \dots, y_n) = (x_1 + y_1, \dots, x_n + y_n). \]


Often the mathematics of $\mathbf{F}^n$ becomes cleaner if we use a single letter to
denote a list of $n$ numbers, without explicitly writing the coordinates. For
example, the result below is stated with $x$ and $y$ in $\mathbf{F}^n$ even though the proof
requires the more cumbersome notation of $(x_1, \dots, x_n)$ and $(y_1, \dots, y_n)$.


1.13 Commutativity of addition in $\mathbf{F}^n$


If $x, y \in \mathbf{F}^n$, then $x + y = y + x$.


Proof Suppose $x = (x_1, \dots, x_n)$ and $y = (y_1, \dots, y_n)$. Then

\[
\begin{aligned}
x + y &= (x_1, \dots, x_n) + (y_1, \dots, y_n) \\
&= (x_1 + y_1, \dots, x_n + y_n) \\
&= (y_1 + x_1, \dots, y_n + x_n) \\
&= (y_1, \dots, y_n) + (x_1, \dots, x_n) \\
&= y + x,
\end{aligned}
\]

where the second and fourth equalities above hold because of the definition of
addition in $\mathbf{F}^n$ and the third equality holds because of the usual commutativity
of addition in F. ∎
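
Coordinatewise addition and its commutativity can be illustrated with a short Python sketch, assuming elements of $\mathbf{F}^n$ are represented as tuples of Python numbers. The function name `add` is an illustrative choice. A numerical check on one pair of lists is of course not a proof; it only shows the definition in action.

```python
def add(x, y):
    """Add two lists of the same length coordinate by coordinate (Definition 1.12)."""
    if len(x) != len(y):
        raise ValueError("both lists must have the same length n")
    return tuple(xj + yj for xj, yj in zip(x, y))

# Two elements of C^3; real entries are allowed because R is a subset of C.
x = (1, 2 + 3j, -4)
y = (5j, 6, 7)

print(add(x, y))  # ((1+5j), (8+3j), 3)

# Commutativity (1.13) holds for this particular pair.
assert add(x, y) == add(y, x)
```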


If a single letter is used to denote an element of $\mathbf{F}^n$, then the same letter
with appropriate subscripts is often used when coordinates must be displayed.
For example, if $x \in \mathbf{F}^n$, then letting $x$ equal $(x_1, \dots, x_n)$ is good notation,
as shown in the proof above. Even better, work with just $x$ and avoid explicit
coordinates when possible.

Note: The symbol ∎ means "end of the proof".


1.14 Definition 0 
Let 0 denote the list of length n whose coordinates are all 0: 


0 =(0,...,0). 
Here we are using the symbol 0 in two different ways: on the left side of the
equation in 1.14, the symbol 0 denotes a list of length $n$, whereas on the right
side, each 0 denotes a number. This potentially confusing practice actually
causes no problems because the context always makes clear what is intended.


1.15 Example Consider the statement that 0 is an additive identity for $\mathbf{F}^n$:

\[ x + 0 = x \quad \text{for all } x \in \mathbf{F}^n. \]

Is the 0 above the number 0 or the list 0?


Solution Here 0 is a list, because we have not defined the sum of an element
of $\mathbf{F}^n$ (namely, $x$) and the number 0.


A picture can aid our intuition. We will draw pictures in $\mathbf{R}^2$ because we
can sketch this space on 2-dimensional surfaces such as paper and blackboards.
A typical element of $\mathbf{R}^2$ is a point $x = (x_1, x_2)$. Sometimes we think of $x$ not
as a point but as an arrow starting at the origin and ending at $(x_1, x_2)$, as shown
here. When we think of $x$ as an arrow, we refer to it as a vector.

[Figure: Elements of $\mathbf{R}^2$ can be thought of as points or as vectors.]

When we think of vectors in $\mathbf{R}^2$ as arrows, we can move an arrow parallel
to itself (not changing its length or direction) and still think of it as the same
vector. With that viewpoint, you will often gain better understanding by
dispensing with the coordinate axes and the explicit coordinates and just
thinking of the vector, as shown here.

[Figure: A vector.]

Note: Mathematical models of the economy can have thousands of variables,
say $x_1, \dots, x_{5000}$, which means that we must operate in $\mathbf{R}^{5000}$. Such a space
cannot be dealt with geometrically. However, the algebraic approach works
well. Thus our subject is called linear algebra.

Whenever we use pictures in $\mathbf{R}^2$ or use the somewhat vague language
of points and vectors, remember that these are just aids to our understanding,
not substitutes for the actual mathematics that we will develop. Although
we cannot draw good pictures in high-dimensional spaces, the elements of
these spaces are as rigorously defined as elements of $\mathbf{R}^2$.
For example, $(2, -3, 17, \pi, \sqrt{2})$ is an element of $\mathbf{R}^5$, and we may casually
refer to it as a point in $\mathbf{R}^5$ or a vector in $\mathbf{R}^5$ without worrying about whether
the geometry of $\mathbf{R}^5$ has any physical meaning.

Recall that we defined the sum of two elements of $\mathbf{F}^n$ to be the element of
$\mathbf{F}^n$ obtained by adding corresponding coordinates; see 1.12. As we will now
see, addition has a simple geometric interpretation in the special case of $\mathbf{R}^2$.

Suppose we have two vectors $x$ and $y$ in $\mathbf{R}^2$ that we want to add. Move
the vector $y$ parallel to itself so that its initial point coincides with the end point
of the vector $x$, as shown here. The sum $x + y$ then equals the vector whose
initial point equals the initial point of $x$ and whose end point equals the end
point of the vector $y$, as shown here.

[Figure: The sum of two vectors.]

In the next definition, the 0 on the right side of the displayed equation
below is the list $0 \in \mathbf{F}^n$.

1.16 Definition additive inverse in $\mathbf{F}^n$


For $x \in \mathbf{F}^n$, the additive inverse of $x$, denoted $-x$, is the vector $-x \in \mathbf{F}^n$
such that

\[ x + (-x) = 0. \]

In other words, if $x = (x_1, \dots, x_n)$, then $-x = (-x_1, \dots, -x_n)$.


For a vector $x \in \mathbf{R}^2$, the additive inverse $-x$ is the vector parallel to $x$ and
with the same length as $x$ but pointing in the opposite direction. The figure here
illustrates this way of thinking about the additive inverse in $\mathbf{R}^2$.

[Figure: A vector and its additive inverse.]

Having dealt with addition in $\mathbf{F}^n$, we now turn to multiplication. We could
define a multiplication in $\mathbf{F}^n$ in a similar fashion, starting with two elements
of $\mathbf{F}^n$ and getting another element of $\mathbf{F}^n$ by multiplying corresponding
coordinates. Experience shows that this definition is not useful for our purposes.
Another type of multiplication, called scalar multiplication, will be central
to our subject. Specifically, we need to define what it means to multiply an
element of $\mathbf{F}^n$ by an element of F.
1.17 Definition scalar multiplication in $\mathbf{F}^n$


The product of a number $\lambda$ and a vector in $\mathbf{F}^n$ is computed by multiplying
each coordinate of the vector by $\lambda$:


\[ \lambda(x_1, \dots, x_n) = (\lambda x_1, \dots, \lambda x_n); \]


here $\lambda \in \mathbf{F}$ and $(x_1, \dots, x_n) \in \mathbf{F}^n$.
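
The definition of scalar multiplication translates directly into code. The sketch below assumes vectors are Python tuples; the names `scale` and `length` are illustrative, and `length` is only used to check numerically that multiplying a vector in $\mathbf{R}^2$ by a scalar multiplies its length by the absolute value of that scalar.

```python
import math

def scale(lam, x):
    """Multiply each coordinate of the list x by the scalar lam (Definition 1.17)."""
    return tuple(lam * xj for xj in x)

def length(x):
    """Euclidean length of a vector in R^2, used only for the geometric check."""
    return math.hypot(x[0], x[1])

x = (3.0, 4.0)
stretched = scale(-1.5, x)
print(stretched)  # (-4.5, -6.0)

# The length of (-1.5)x is |-1.5| times the length of x.
assert math.isclose(length(stretched), abs(-1.5) * length(x))

# The scalar and the coordinates may also be complex.
print(scale(1j, (1, 1j)))
```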


Scalar multiplication has a nice geometric interpretation in $\mathbf{R}^2$. If $\lambda$ is a
positive number and $x$ is a vector in $\mathbf{R}^2$, then $\lambda x$ is the vector that points
in the same direction as $x$ and whose length is $\lambda$ times the length of $x$. In
other words, to get $\lambda x$, we shrink or stretch $x$ by a factor of $\lambda$, depending on
whether $\lambda < 1$ or $\lambda > 1$.

If $\lambda$ is a negative number and $x$ is a vector in $\mathbf{R}^2$, then $\lambda x$ is the vector that
points in the direction opposite to that of $x$ and whose length is $|\lambda|$ times the
length of $x$, as shown here.

[Figure: Scalar multiplication, showing $x$ and $(-3/2)x$.]

Note: In scalar multiplication, we multiply together a scalar and a vector,
getting a vector. You may be familiar with the dot product in $\mathbf{R}^2$ or $\mathbf{R}^3$,
in which we multiply together two vectors and get a scalar. Generalizations
of the dot product will become important when we study inner products in
Chapter 6.


Digression on Fields 


A field is a set containing at least two distinct elements called 0 and 1, along 
with operations of addition and multiplication satisfying all the properties 
listed in 1.3. Thus R and C are fields, as is the set of rational numbers along 
with the usual operations of addition and multiplication. Another example of 
a field is the set {0, 1} with the usual operations of addition and multiplication 
except that 1 + 1 is defined to equal 0. 
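
Because the set {0, 1} is finite, the properties listed in 1.3 can be checked exhaustively. The sketch below assumes that the usual operations with 1 + 1 defined to equal 0 amount to arithmetic modulo 2; the names `F2`, `add`, `mul`, and `satisfies_field_properties` are illustrative.

```python
from itertools import product

F2 = (0, 1)

def add(a, b):
    """Usual addition, except that 1 + 1 equals 0."""
    return (a + b) % 2

def mul(a, b):
    """Usual multiplication on {0, 1}."""
    return (a * b) % 2

def satisfies_field_properties():
    """Check every property listed in 1.3 for all elements of {0, 1}."""
    for a, b, c in product(F2, repeat=3):
        if add(a, b) != add(b, a) or mul(a, b) != mul(b, a):
            return False  # commutativity fails
        if add(add(a, b), c) != add(a, add(b, c)):
            return False  # associativity of addition fails
        if mul(mul(a, b), c) != mul(a, mul(b, c)):
            return False  # associativity of multiplication fails
        if mul(c, add(a, b)) != add(mul(c, a), mul(c, b)):
            return False  # distributive property fails
    for a in F2:
        if add(a, 0) != a or mul(a, 1) != a:
            return False  # identities fail
        if sum(1 for b in F2 if add(a, b) == 0) != 1:
            return False  # additive inverse missing or not unique
        if a != 0 and sum(1 for b in F2 if mul(a, b) == 1) != 1:
            return False  # multiplicative inverse missing or not unique
    return True

print(satisfies_field_properties())  # True
```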

In this book we will not need to deal with fields other than R and C. 
However, many of the definitions, theorems, and proofs in linear algebra that 
work for both R and C also work without change for arbitrary fields. If you 
prefer to do so, throughout Chapters 1, 2, and 3 you can think of F as denoting 
an arbitrary field instead of R or C, except that some of the examples and
exercises require that for each positive integer $n$ we have

\[ \underbrace{1 + 1 + \cdots + 1}_{n \text{ times}} \ne 0. \]
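EXERCISES 1.A


1 Suppose $a$ and $b$ are real numbers, not both 0. Find real numbers $c$ and
$d$ such that
\[ 1/(a + bi) = c + di. \]

2 Show that
\[ \frac{-1 + \sqrt{3}\,i}{2} \]
is a cube root of 1 (meaning that its cube equals 1).

3 Find two distinct square roots of $i$.

4 Show that $\alpha + \beta = \beta + \alpha$ for all $\alpha, \beta \in \mathbf{C}$.

5 Show that $(\alpha + \beta) + \lambda = \alpha + (\beta + \lambda)$ for all $\alpha, \beta, \lambda \in \mathbf{C}$.

6 Show that $(\alpha\beta)\lambda = \alpha(\beta\lambda)$ for all $\alpha, \beta, \lambda \in \mathbf{C}$.

7 Show that for every $\alpha \in \mathbf{C}$, there exists a unique $\beta \in \mathbf{C}$ such that
$\alpha + \beta = 0$.

8 Show that for every $\alpha \in \mathbf{C}$ with $\alpha \ne 0$, there exists a unique $\beta \in \mathbf{C}$ such
that $\alpha\beta = 1$.

9 Show that $\lambda(\alpha + \beta) = \lambda\alpha + \lambda\beta$ for all $\lambda, \alpha, \beta \in \mathbf{C}$.

10 Find $x \in \mathbf{R}^4$ such that
\[ (4, -3, 1, 7) + 2x = (5, 9, -6, 8). \]

11 Explain why there does not exist $\lambda \in \mathbf{C}$ such that
\[ \lambda(2 - 3i, 5 + 4i, -6 + 7i) = (12 - 5i, 7 + 22i, -32 - 9i). \]

12 Show that $(x + y) + z = x + (y + z)$ for all $x, y, z \in \mathbf{F}^n$.

13 Show that $(ab)x = a(bx)$ for all $x \in \mathbf{F}^n$ and all $a, b \in \mathbf{F}$.

14 Show that $1x = x$ for all $x \in \mathbf{F}^n$.

15 Show that $\lambda(x + y) = \lambda x + \lambda y$ for all $\lambda \in \mathbf{F}$ and all $x, y \in \mathbf{F}^n$.

16 Show that $(a + b)x = ax + bx$ for all $a, b \in \mathbf{F}$ and all $x \in \mathbf{F}^n$.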
1.B Definition of Vector Space


The motivation for the definition of a vector space comes from properties of
addition and scalar multiplication in $\mathbf{F}^n$: Addition is commutative, associative,
and has an identity. Every element has an additive inverse. Scalar multiplication
is associative. Scalar multiplication by 1 acts as expected. Addition and
scalar multiplication are connected by distributive properties.

We will define a vector space to be a set V with an addition and a scalar 
multiplication on V that satisfy the properties in the paragraph above. 


1.18 Definition addition, scalar multiplication 


• An addition on a set V is a function that assigns an element $u + v \in V$
to each pair of elements $u, v \in V$.


• A scalar multiplication on a set V is a function that assigns an element
$\lambda v \in V$ to each $\lambda \in \mathbf{F}$ and each $v \in V$.


Now we are ready to give the formal definition of a vector space. 


1.19 Definition vector space 


A vector space is a set V along with an addition on V and a scalar
multiplication on V such that the following properties hold:


commutativity
$u + v = v + u$ for all $u, v \in V$;


associativity
$(u + v) + w = u + (v + w)$ and $(ab)v = a(bv)$ for all $u, v, w \in V$
and all $a, b \in \mathbf{F}$;


additive identity
there exists an element $0 \in V$ such that $v + 0 = v$ for all $v \in V$;


additive inverse
for every $v \in V$, there exists $w \in V$ such that $v + w = 0$;


multiplicative identity
$1v = v$ for all $v \in V$;


distributive properties
$a(u + v) = au + av$ and $(a + b)v = av + bv$ for all $a, b \in \mathbf{F}$ and
all $u, v \in V$.
The following geometric language sometimes aids our intuition.


1.20 Definition vector, point 


Elements of a vector space are called vectors or points. 


The scalar multiplication in a vector space depends on F. Thus when we
need to be precise, we will say that V is a vector space over F instead of
saying simply that V is a vector space. For example, $\mathbf{R}^n$ is a vector space over
R, and $\mathbf{C}^n$ is a vector space over C.


1.21 Definition real vector space, complex vector space 
• A vector space over R is called a real vector space.


• A vector space over C is called a complex vector space.


Usually the choice of F is either obvious from the context or irrelevant. 
Thus we often assume that F is lurking in the background without specifically 
mentioning it. 

With the usual operations of addition and scalar multiplication, $\mathbf{F}^n$ is a vector
space over F, as you should verify. The example of $\mathbf{F}^n$ motivated our definition
of vector space.

Note: The simplest vector space contains only one point. In other words, {0}
is a vector space.


1.22 Example $\mathbf{F}^\infty$ is defined to be the set of all sequences of elements
of F:

\[ \mathbf{F}^\infty = \{(x_1, x_2, \dots) : x_j \in \mathbf{F} \text{ for } j = 1, 2, \dots\}. \]


Addition and scalar multiplication on $\mathbf{F}^\infty$ are defined as expected:

\[ (x_1, x_2, \dots) + (y_1, y_2, \dots) = (x_1 + y_1, x_2 + y_2, \dots), \]
\[ \lambda(x_1, x_2, \dots) = (\lambda x_1, \lambda x_2, \dots). \]


With these definitions, $\mathbf{F}^\infty$ becomes a vector space over F, as you should
verify. The additive identity in this vector space is the sequence of all 0's.


Our next example of a vector space involves a set of functions. 
1.23 Notation $\mathbf{F}^S$

• If S is a set, then $\mathbf{F}^S$ denotes the set of functions from S to F.

• For $f, g \in \mathbf{F}^S$, the sum $f + g \in \mathbf{F}^S$ is the function defined by
\[ (f + g)(x) = f(x) + g(x) \]
for all $x \in S$.


• For $\lambda \in \mathbf{F}$ and $f \in \mathbf{F}^S$, the product $\lambda f \in \mathbf{F}^S$ is the function
defined by
\[ (\lambda f)(x) = \lambda f(x) \]
for all $x \in S$.

As an example of the notation above, if S is the interval [0, 1] and F = R,
then $\mathbf{R}^{[0,1]}$ is the set of real-valued functions on the interval [0, 1].

You should verify all three bullet points in the next example.
1.24 Example F^S is a vector space

• If S is a nonempty set, then F^S (with the operations of addition and
scalar multiplication as defined above) is a vector space over F.

• The additive identity of F^S is the function 0 : S → F defined by
0(x) = 0

for all x ∈ S.

• For f ∈ F^S, the additive inverse of f is the function −f : S → F
defined by
(−f)(x) = −f(x)

for all x ∈ S.
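The pointwise operations of 1.23 and the identity and inverse of 1.24 can be sketched in Python by modelling an element of F^S as a callable on S. This is only an illustration under stated assumptions: the helper names add, scale, zero, and neg are hypothetical (they are not notation from the text), exact rationals stand in for F, and checking finitely many sample points of S = [0, 1] proves nothing about all of S.

```python
from fractions import Fraction

def add(f, g):
    """Pointwise sum: (f + g)(x) = f(x) + g(x)."""
    return lambda x: f(x) + g(x)

def scale(lam, f):
    """Pointwise scalar multiple: (lam f)(x) = lam * f(x)."""
    return lambda x: lam * f(x)

def zero(x):
    """Additive identity of F^S: the function that is 0 everywhere."""
    return 0

def neg(f):
    """Additive inverse: (-f)(x) = -f(x)."""
    return lambda x: -f(x)

# Two functions on S = [0, 1], evaluated at a few sample points only.
f = lambda x: x * x
g = lambda x: 1 - x
h = add(f, scale(3, g))  # h = f + 3g

sample = [Fraction(k, 4) for k in range(5)]  # 0, 1/4, 1/2, 3/4, 1
assert all(h(x) == x * x + 3 * (1 - x) for x in sample)
assert all(add(f, neg(f))(x) == zero(x) for x in sample)  # f + (-f) = 0
assert all(add(f, zero)(x) == f(x) for x in sample)       # f + 0 = f
```

Representing vectors as closures keeps the sketch close to the definition: nothing is stored except the rule for evaluating the function at a point of S.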
Our previous examples of vector spaces, F^n and F^∞, are special cases of the
vector space F^S because a list of length n of numbers in F can be thought of
as a function from {1, 2, ..., n} to F and a sequence of numbers in F can be
thought of as a function from the set of positive integers to F. In other
words, we can think of F^n as F^{{1, 2, ..., n}} and we can think of F^∞ as
F^{{1, 2, ...}}.

The elements of the vector space R^[0,1] are real-valued functions on [0, 1],
not lists. In general, a vector space is an abstract entity whose elements
might be lists, functions, or weird objects.

Soon we will see further examples of vector spaces, but first we need to
develop some of the elementary properties of vector spaces.

The definition of a vector space requires that it have an additive identity. 
The result below states that this identity is unique. 


1.25 Unique additive identity 
A vector space has a unique additive identity. 
Proof Suppose 0 and 0' are both additive identities for some vector space V.
Then
0' = 0' + 0 = 0 + 0' = 0,

where the first equality holds because 0 is an additive identity, the second
equality comes from commutativity, and the third equality holds because 0'
is an additive identity. Thus 0' = 0, proving that V has only one additive
identity. ∎


Each element v in a vector space has an additive inverse, an element w in 
the vector space such that v + w = 0. The next result shows that each element 
in a vector space has only one additive inverse. 


1.26 Unique additive inverse 


Every element in a vector space has a unique additive inverse. 


Proof Suppose V is a vector space. Let v ∈ V. Suppose w and w' are additive
inverses of v. Then

w = w + 0 = w + (v + w') = (w + v) + w' = 0 + w' = w'.
Thus w = w', as desired. ∎


Because additive inverses are unique, the following notation now makes 
sense. 


1.27 Notation −v, w − v
Let v, w ∈ V. Then
• −v denotes the additive inverse of v;

• w − v is defined to be w + (−v).

Almost all the results in this book involve some vector space. To avoid
having to restate frequently that V is a vector space, we now make the
necessary declaration once and for all:


1.28 Notation V 
For the rest of the book, V denotes a vector space over F. 

In the next result, 0 denotes a scalar (the number 0 ∈ F) on the left side of
the equation and a vector (the additive identity of V) on the right side of the
equation.

1.29 The number 0 times a vector

0v = 0 for every v ∈ V.

Proof For v ∈ V, we have

0v = (0 + 0)v = 0v + 0v.

Adding the additive inverse of 0v to both sides of the equation above gives
0 = 0v, as desired. ∎

Note that 1.29 asserts something about scalar multiplication and the additive
identity of V. The only part of the definition of a vector space that connects
scalar multiplication and vector addition is the distributive property. Thus
the distributive property must be used in the proof of 1.29.


In the next result, 0 denotes the addi- 
tive identity of V. Although their proofs 
are similar, 1.29 and 1.30 are not identical. More precisely, 1.29 states that 
the product of the scalar 0 and any vector equals the vector 0, whereas 1.30 
states that the product of any scalar and the vector 0 equals the vector 0. 


1.30 A number times the vector 0 


a0 = 0 for every a ∈ F.
Proof For a ∈ F, we have

a0 = a(0 + 0) = a0 + a0.

Adding the additive inverse of a0 to both sides of the equation above gives
0 = a0, as desired. ∎


Now we show that if an element of V is multiplied by the scalar −1, then
the result is the additive inverse of the element of V.

1.31 The number −1 times a vector

(−1)v = −v for every v ∈ V.

Proof For v ∈ V, we have
v + (−1)v = 1v + (−1)v = (1 + (−1))v = 0v = 0.

This equation says that (−1)v, when added to v, gives 0. Thus (−1)v is the
additive inverse of v, as desired. ∎


EXERCISES 1.B 


1 Prove that −(−v) = v for every v ∈ V.
2 Suppose a ∈ F, v ∈ V, and av = 0. Prove that a = 0 or v = 0.

3 Suppose v, w ∈ V. Explain why there exists a unique x ∈ V such that
v + 3x = w.


4 The empty set is not a vector space. The empty set fails to satisfy only 
one of the requirements listed in 1.19. Which one? 


5 Show that in the definition of a vector space (1.19), the additive inverse 
condition can be replaced with the condition that 


0v = 0 for all v ∈ V.


Here the 0 on the left side is the number 0, and the 0 on the right side is 
the additive identity of V. (The phrase “a condition can be replaced” in a 
definition means that the collection of objects satisfying the definition is 
unchanged if the original condition is replaced with the new condition.) 


6 Let ∞ and −∞ denote two distinct objects, neither of which is in R.
Define an addition and scalar multiplication on R ∪ {∞} ∪ {−∞} as you
could guess from the notation. Specifically, the sum and product of two
real numbers is as usual, and for t ∈ R define

t∞ = −∞ if t < 0,      t(−∞) = ∞ if t < 0,
t∞ = 0 if t = 0,       t(−∞) = 0 if t = 0,
t∞ = ∞ if t > 0,       t(−∞) = −∞ if t > 0,
t + ∞ = ∞ + t = ∞,     t + (−∞) = (−∞) + t = −∞,
∞ + ∞ = ∞,     (−∞) + (−∞) = −∞,     ∞ + (−∞) = 0.

Is R ∪ {∞} ∪ {−∞} a vector space over R? Explain.

1.C  Subspaces

By considering subspaces, we can greatly expand our examples of vector
spaces.


1.32 Definition subspace 


A subset U of V is called a subspace of V if U is also a vector space 
(using the same addition and scalar multiplication as on V). 


1.33 Example  {(x_1, x_2, 0) : x_1, x_2 ∈ F} is a subspace of F^3.

Some mathematicians use the term linear subspace, which means the same as
subspace.

The next result gives the easiest way to check whether a subset of a vector
space is a subspace.

1.34 Conditions for a subspace

A subset U of V is a subspace of V if and only if U satisfies the following
three conditions:

additive identity
0 ∈ U;

closed under addition
u, w ∈ U implies u + w ∈ U;

closed under scalar multiplication
a ∈ F and u ∈ U implies au ∈ U.

The additive identity condition above could be replaced with the condition
that U is nonempty (then taking u ∈ U, multiplying it by 0, and using the
condition that U is closed under scalar multiplication would imply that
0 ∈ U). However, if U is indeed a subspace of V, then the easiest way to show
that U is nonempty is to show that 0 ∈ U.


Proof If U is a subspace of V, then U 
satisfies the three conditions above by 
the definition of vector space. 

Conversely, suppose U satisfies the 
three conditions above. The first con- 
dition above ensures that the additive 
identity of V is in U. 

The second condition above ensures that addition makes sense on U. The
third condition ensures that scalar multiplication makes sense on U.

If u ∈ U, then −u [which equals (−1)u by 1.31] is also in U by the third
condition above. Hence every element of U has an additive inverse in U.

The other parts of the definition of a vector space, such as associativity
and commutativity, are automatically satisfied for U because they hold on the
larger space V. Thus U is a vector space and hence is a subspace of V. ∎


The three conditions in the result above usually enable us to determine 
quickly whether a given subset of V is a subspace of V. You should verify all 
the assertions in the next example. 


1.35 Example subspaces 


(a) If b ∈ F, then
{(x_1, x_2, x_3, x_4) ∈ F^4 : x_3 = 5x_4 + b}

is a subspace of F^4 if and only if b = 0.

(b) The set of continuous real-valued functions on the interval [0, 1] is a
subspace of R^[0,1].

(c) The set of differentiable real-valued functions on R is a subspace of R^R.

(d) The set of differentiable real-valued functions f on the interval (0, 3)
such that f'(2) = b is a subspace of R^(0,3) if and only if b = 0.

(e) The set of all sequences of complex numbers with limit 0 is a subspace
of C^∞.
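Part (a) lends itself to a quick numerical spot-check of the three conditions in 1.34. The sketch below is not a proof: it tests the conditions only on a small grid of rational sample points, so it can refute the subspace property (as for b = 3) but can merely fail to refute it (as for b = 0). The names in_U, point, and passes_checks are hypothetical helpers introduced for this illustration.

```python
from fractions import Fraction
from itertools import product

def in_U(x, b):
    """Membership test for {(x1, x2, x3, x4) : x3 = 5*x4 + b}."""
    return x[2] == 5 * x[3] + b

def point(x1, x2, x4, b):
    """The element of the set with free coordinates x1, x2, x4."""
    return (x1, x2, 5 * x4 + b, x4)

def add(u, w):
    return tuple(a + c for a, c in zip(u, w))

def scale(a, u):
    return tuple(a * c for c in u)

def passes_checks(b):
    """Spot-check the three conditions of 1.34 on a small sample."""
    vals = [Fraction(-1), Fraction(0), Fraction(1, 2), Fraction(2)]
    pts = [point(x1, x2, x4, b) for x1, x2, x4 in product(vals, repeat=3)]
    if not in_U((0, 0, 0, 0), b):                               # additive identity
        return False
    if not all(in_U(add(u, w), b) for u in pts for w in pts):   # closed under addition
        return False
    return all(in_U(scale(a, u), b) for a in vals for u in pts)  # closed under scalar mult.

print(passes_checks(0), passes_checks(3))  # True False
```

For b = 3 the very first condition already fails, because (0, 0, 0, 0) does not satisfy x_3 = 5x_4 + 3.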


Verifying some of the items above shows the linear structure underlying
parts of calculus. For example, the second item above requires the result that
the sum of two continuous functions is continuous. As another example, the
fourth item above requires the result that for a constant c, the derivative of
cf equals c times the derivative of f.

Clearly {0} is the smallest subspace of V and V itself is the largest subspace
of V. The empty set is not a subspace of V because a subspace must be a vector
space and hence must contain at least one element, namely, an additive
identity.

The subspaces of R^2 are precisely {0}, R^2, and all lines in R^2 through the
origin. The subspaces of R^3 are precisely {0}, R^3, all lines in R^3 through the
origin, and all planes in R^3 through the origin. To prove that all these objects
are indeed subspaces is easy; the hard part is to show that they are the only
subspaces of R^2 and R^3. That task will be easier after we introduce some
additional tools in the next chapter.

Sums of Subspaces

When dealing with vector spaces, we are usually interested only in subspaces,
as opposed to arbitrary subsets. The notion of the sum of subspaces will be
useful.

The union of subspaces is rarely a subspace (see Exercise 12), which is why we
usually work with sums rather than unions.


1.36 Definition sum of subsets 


Suppose U_1, ..., U_m are subsets of V. The sum of U_1, ..., U_m, denoted
U_1 + ⋯ + U_m, is the set of all possible sums of elements of U_1, ..., U_m.
More precisely,

U_1 + ⋯ + U_m = {u_1 + ⋯ + u_m : u_1 ∈ U_1, ..., u_m ∈ U_m}.


Let’s look at some examples of sums of subspaces. 


1.37 Example Suppose U is the set of all elements of F^3 whose second
and third coordinates equal 0, and W is the set of all elements of F^3 whose
first and third coordinates equal 0:

U = {(x, 0, 0) ∈ F^3 : x ∈ F} and W = {(0, y, 0) ∈ F^3 : y ∈ F}.

Then
U + W = {(x, y, 0) : x, y ∈ F},

as you should verify.

1.38 Example Suppose that U = {(x, x, y, y) ∈ F^4 : x, y ∈ F} and
W = {(x, x, x, y) ∈ F^4 : x, y ∈ F}. Then

U + W = {(x, x, y, z) ∈ F^4 : x, y, z ∈ F},

as you should verify.
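Example 1.38 can be checked mechanically once each subspace is described by vectors whose linear combinations fill it out (the idea that Chapter 2 formalizes as span). The sketch below assumes rational scalars and uses a hypothetical helper solve, a plain Gauss-Jordan elimination written for this illustration. Two such sets are equal exactly when each generator of one is a combination of the generators of the other.

```python
from fractions import Fraction

def solve(vectors, target):
    """Return scalars a with sum(a[i] * vectors[i]) == target, or None if none exist."""
    n, m = len(target), len(vectors)
    # One row per coordinate, one column per vector, plus the target column.
    rows = [[Fraction(v[i]) for v in vectors] + [Fraction(target[i])] for i in range(n)]
    pivots, r = [], 0
    for c in range(m):
        p = next((i for i in range(r, n) if rows[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column
        rows[r], rows[p] = rows[p], rows[r]
        rows[r] = [x / rows[r][c] for x in rows[r]]
        for i in range(n):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        pivots.append((r, c))
        r += 1
    if any(row[m] != 0 for row in rows[r:]):
        return None  # a row reads 0 = nonzero, so there is no solution
    coeffs = [Fraction(0)] * m
    for pr, pc in pivots:
        coeffs[pc] = rows[pr][m]
    return coeffs

def in_span(v, vectors):
    return solve(vectors, v) is not None

U_gens = [(1, 1, 0, 0), (0, 0, 1, 1)]  # (x, x, y, y) = x(1,1,0,0) + y(0,0,1,1)
W_gens = [(1, 1, 1, 0), (0, 0, 0, 1)]  # (x, x, x, y) = x(1,1,1,0) + y(0,0,0,1)
T_gens = [(1, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]  # (x, x, y, z)

sum_gens = U_gens + W_gens  # every element of U + W is a combination of these
sum_inside_T = all(in_span(v, T_gens) for v in sum_gens)
T_inside_sum = all(in_span(v, sum_gens) for v in T_gens)
print(sum_inside_T, T_inside_sum)  # True True
```

The key step is that (0, 0, 1, 0) = (1, 1, 1, 0) − (1, 1, 0, 0), which is what lets the third coordinate vary independently in U + W.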


The next result states that the sum of subspaces is a subspace, and is in 
fact the smallest subspace containing all the summands. 


1.39 Sum of subspaces is the smallest containing subspace

Suppose U_1, ..., U_m are subspaces of V. Then U_1 + ⋯ + U_m is the
smallest subspace of V containing U_1, ..., U_m.

Proof It is easy to see that 0 ∈ U_1 + ⋯ + U_m and that U_1 + ⋯ + U_m
is closed under addition and scalar multiplication. Thus 1.34 implies that
U_1 + ⋯ + U_m is a subspace of V.

Clearly U_1, ..., U_m are all contained in U_1 + ⋯ + U_m (to see this,
consider sums u_1 + ⋯ + u_m where all except one of the u's are 0).
Conversely, every subspace of V containing U_1, ..., U_m contains
U_1 + ⋯ + U_m (because subspaces must contain all finite sums of their
elements). Thus U_1 + ⋯ + U_m is the smallest subspace of V containing
U_1, ..., U_m. ∎

Sums of subspaces in the theory of vector spaces are analogous to unions of
subsets in set theory. Given two subspaces of a vector space, the smallest
subspace containing them is their sum. Analogously, given two subsets of a
set, the smallest subset containing them is their union.


Direct Sums 


Suppose U_1, ..., U_m are subspaces of V. Every element of U_1 + ⋯ + U_m
can be written in the form

u_1 + ⋯ + u_m,
where each u_j is in U_j. We will be especially interested in cases where each
vector in U_1 + ⋯ + U_m can be represented in the form above in only one
way. This situation is so important that we give it a special name: direct sum.


1.40 Definition direct sum 


Suppose U_1, ..., U_m are subspaces of V.

• The sum U_1 + ⋯ + U_m is called a direct sum if each element
of U_1 + ⋯ + U_m can be written in only one way as a sum
u_1 + ⋯ + u_m, where each u_j is in U_j.

• If U_1 + ⋯ + U_m is a direct sum, then U_1 ⊕ ⋯ ⊕ U_m denotes
U_1 + ⋯ + U_m, with the ⊕ notation serving as an indication that
this is a direct sum.

1.41 Example Suppose U is the subspace of F^3 of those vectors whose
last coordinate equals 0, and W is the subspace of F^3 of those vectors whose
first two coordinates equal 0:

U = {(x, y, 0) ∈ F^3 : x, y ∈ F} and W = {(0, 0, z) ∈ F^3 : z ∈ F}.
Then F^3 = U ⊕ W, as you should verify.

1.42 Example Suppose U_j is the subspace of F^n of those vectors whose
coordinates are all 0, except possibly in the j-th slot (thus, for example,
U_2 = {(0, x, 0, ..., 0) ∈ F^n : x ∈ F}). Then

F^n = U_1 ⊕ ⋯ ⊕ U_n,
as you should verify.


Sometimes nonexamples add to our understanding as much as examples. 


1.43 Example Let
U_1 = {(x, y, 0) ∈ F^3 : x, y ∈ F},
U_2 = {(0, 0, z) ∈ F^3 : z ∈ F},
U_3 = {(0, y, y) ∈ F^3 : y ∈ F}.
Show that U_1 + U_2 + U_3 is not a direct sum.

Solution Clearly F^3 = U_1 + U_2 + U_3, because every vector (x, y, z) ∈ F^3
can be written as

(x, y, z) = (x, y, 0) + (0, 0, z) + (0, 0, 0),

where the first vector on the right side is in U_1, the second vector is in U_2,
and the third vector is in U_3.

However, F^3 does not equal the direct sum of U_1, U_2, U_3, because the
vector (0, 0, 0) can be written in two different ways as a sum u_1 + u_2 + u_3,
with each u_j in U_j. Specifically, we have

(0, 0, 0) = (0, 1, 0) + (0, 0, 1) + (0, −1, −1)
and, of course,
(0, 0, 0) = (0, 0, 0) + (0, 0, 0) + (0, 0, 0),

where the first vector on the right side of each equation above is in U_1, the
second vector is in U_2, and the third vector is in U_3.
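The two representations of (0, 0, 0) in Example 1.43 are easy to confirm by direct computation. In the sketch below, the membership tests in_U1, in_U2, in_U3 and the helper add are hypothetical names that simply transcribe the defining conditions of the three subspaces.

```python
def add(*vectors):
    """Coordinatewise sum of any number of vectors of equal length."""
    return tuple(sum(cs) for cs in zip(*vectors))

def in_U1(v):  # third coordinate is 0
    return v[2] == 0

def in_U2(v):  # first two coordinates are 0
    return v[0] == 0 and v[1] == 0

def in_U3(v):  # first coordinate is 0, last two coordinates agree
    return v[0] == 0 and v[1] == v[2]

rep_a = [(0, 1, 0), (0, 0, 1), (0, -1, -1)]
rep_b = [(0, 0, 0), (0, 0, 0), (0, 0, 0)]

for u1, u2, u3 in (rep_a, rep_b):
    assert in_U1(u1) and in_U2(u2) and in_U3(u3)
    assert add(u1, u2, u3) == (0, 0, 0)

# Two genuinely different representations of 0, so the sum is not direct.
assert rep_a != rep_b
```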


The definition of direct sum requires that every vector in the sum have a
unique representation as an appropriate sum. The next result shows that when
deciding whether a sum of subspaces is a direct sum, we need only consider
whether 0 can be uniquely written as an appropriate sum.

The symbol ⊕, which is a plus sign inside a circle, serves as a reminder that
we are dealing with a special type of sum of subspaces: each element in the
direct sum can be represented only one way as a sum of elements from the
specified subspaces.

1.44 Condition for a direct sum

Suppose U_1, ..., U_m are subspaces of V. Then U_1 + ⋯ + U_m is a direct
sum if and only if the only way to write 0 as a sum u_1 + ⋯ + u_m, where
each u_j is in U_j, is by taking each u_j equal to 0.


Proof First suppose U_1 + ⋯ + U_m is a direct sum. Then the definition of
direct sum implies that the only way to write 0 as a sum u_1 + ⋯ + u_m, where
each u_j is in U_j, is by taking each u_j equal to 0.

Now suppose that the only way to write 0 as a sum u_1 + ⋯ + u_m, where
each u_j is in U_j, is by taking each u_j equal to 0. To show that U_1 + ⋯ + U_m
is a direct sum, let v ∈ U_1 + ⋯ + U_m. We can write

v = u_1 + ⋯ + u_m

for some u_1 ∈ U_1, ..., u_m ∈ U_m. To show that this representation is unique,
suppose we also have
v = v_1 + ⋯ + v_m,

where v_1 ∈ U_1, ..., v_m ∈ U_m. Subtracting these two equations, we have

0 = (u_1 − v_1) + ⋯ + (u_m − v_m).
Because u_1 − v_1 ∈ U_1, ..., u_m − v_m ∈ U_m, the equation above implies that
each u_j − v_j equals 0. Thus u_1 = v_1, ..., u_m = v_m, as desired. ∎


The next result gives a simple condition for testing which pairs of sub- 
spaces give a direct sum. 


1.45 Direct sum of two subspaces 


Suppose U and W are subspaces of V. Then U + W is a direct sum if
and only if U ∩ W = {0}.

Proof First suppose that U + W is a direct sum. If v ∈ U ∩ W, then
0 = v + (−v), where v ∈ U and −v ∈ W. By the unique representation
of 0 as the sum of a vector in U and a vector in W, we have v = 0. Thus
U ∩ W = {0}, completing the proof in one direction.

To prove the other direction, now suppose U ∩ W = {0}. To prove that
U + W is a direct sum, suppose u ∈ U, w ∈ W, and

0 = u + w.

To complete the proof, we need only show that u = w = 0 (by 1.44). The
equation above implies that u = −w ∈ W. Thus u ∈ U ∩ W. Hence u = 0,
which by the equation above implies that w = 0, completing the proof. ∎

The result above deals only with the case of two subspaces. When asking
about a possible direct sum with more than two subspaces, it is not enough
to test that each pair of the subspaces intersect only at 0. To see this,
consider Example 1.43. In that nonexample of a direct sum, we have
U_1 ∩ U_2 = U_1 ∩ U_3 = U_2 ∩ U_3 = {0}.

Sums of subspaces are analogous to unions of subsets. Similarly, direct sums
of subspaces are analogous to disjoint unions of subsets. No two subspaces of
a vector space can be disjoint, because both contain 0. So disjointness is
replaced, at least in the case of two subspaces, with the requirement that
the intersection equals {0}.


EXERCISES 1.C 


1 For each of the following subsets of F^3, determine whether it is a
subspace of F^3:

(b) {(x_1, x_2, x_3) ∈ F^3 : x_1 + 2x_2 + 3x_3 = 4};

(c) {(x_1, x_2, x_3) ∈ F^3 : x_1 x_2 x_3 = 0};

(d) {(x_1, x_2, x_3) ∈ F^3 : x_1 = 5x_3}.

2 Verify all the assertions in Example 1.35.

3 Show that the set of differentiable real-valued functions f on the interval
(−4, 4) such that f'(−1) = 3f(2) is a subspace of R^(−4,4).

4 Suppose b ∈ R. Show that the set of continuous real-valued functions f
on the interval [0, 1] such that ∫_0^1 f = b is a subspace of R^[0,1] if and
only if b = 0.

5 Is R^2 a subspace of the complex vector space C^2?

6 (a) Is {(a, b, c) ∈ R^3 : a^3 = b^3} a subspace of R^3?
(b) Is {(a, b, c) ∈ C^3 : a^3 = b^3} a subspace of C^3?

7 Give an example of a nonempty subset U of R^2 such that U is closed
under addition and under taking additive inverses (meaning −u ∈ U
whenever u ∈ U), but U is not a subspace of R^2.

8 Give an example of a nonempty subset U of R^2 such that U is closed
under scalar multiplication, but U is not a subspace of R^2.

9 A function f : R → R is called periodic if there exists a positive number
p such that f(x) = f(x + p) for all x ∈ R. Is the set of periodic
functions from R to R a subspace of R^R? Explain.

10 Suppose U_1 and U_2 are subspaces of V. Prove that the intersection
U_1 ∩ U_2 is a subspace of V.

11 Prove that the intersection of every collection of subspaces of V is a
subspace of V.

12 Prove that the union of two subspaces of V is a subspace of V if and
only if one of the subspaces is contained in the other.

13 Prove that the union of three subspaces of V is a subspace of V if and
only if one of the subspaces contains the other two.

[This exercise is surprisingly harder than the previous exercise, possibly
because this exercise is not true if we replace F with a field containing
only two elements.]

14 Verify the assertion in Example 1.38.

15 Suppose U is a subspace of V. What is U + U?

16 Is the operation of addition on the subspaces of V commutative? In other
words, if U and W are subspaces of V, is U + W = W + U?

17 Is the operation of addition on the subspaces of V associative? In other
words, if U_1, U_2, U_3 are subspaces of V, is

(U_1 + U_2) + U_3 = U_1 + (U_2 + U_3)?

18 Does the operation of addition on the subspaces of V have an additive
identity? Which subspaces have additive inverses?

19 Prove or give a counterexample: if U_1, U_2, W are subspaces of V such
that
U_1 + W = U_2 + W,

then U_1 = U_2.

20 Suppose
U = {(x, x, y, y) ∈ F^4 : x, y ∈ F}.

Find a subspace W of F^4 such that F^4 = U ⊕ W.

21 Suppose

U = {(x, y, x + y, x − y, 2x) ∈ F^5 : x, y ∈ F}.
Find a subspace W of F^5 such that F^5 = U ⊕ W.

22 Suppose

U = {(x, y, x + y, x − y, 2x) ∈ F^5 : x, y ∈ F}.

Find three subspaces W_1, W_2, W_3 of F^5, none of which equals {0}, such
that F^5 = U ⊕ W_1 ⊕ W_2 ⊕ W_3.

23 Prove or give a counterexample: if U_1, U_2, W are subspaces of V such
that
V = U_1 ⊕ W and V = U_2 ⊕ W,

then U_1 = U_2.

24 A function f : R → R is called even if

f(−x) = f(x)
for all x ∈ R. A function f : R → R is called odd if
f(−x) = −f(x)

for all x ∈ R. Let U_e denote the set of real-valued even functions on R
and let U_o denote the set of real-valued odd functions on R. Show that
R^R = U_e ⊕ U_o.


CHAPTER 2

Finite-Dimensional Vector Spaces

American mathematician Paul Halmos (1916-2006), who in 1942 published the
first modern linear algebra book. The title of Halmos's book was the same as
the title of this chapter.


Let’s review our standing assumptions: 


2.1 Notation F, V 
• F denotes R or C.

• V denotes a vector space over F.


In the last chapter we learned about vector spaces. Linear algebra focuses 
not on arbitrary vector spaces, but on finite-dimensional vector spaces, which 
we introduce in this chapter. 


LEARNING OBJECTIVES FOR THIS CHAPTER
■ span
■ linear independence
■ bases
■ dimension

2.A  Span and Linear Independence

We have been writing lists of numbers surrounded by parentheses, and we will
continue to do so for elements of F^n; for example, (2, −7, 8) ∈ F^3. However,
now we need to consider lists of vectors (which may be elements of F^n or of
other vector spaces). To avoid confusion, we will usually write lists of vectors
without surrounding parentheses. For example, (4, 1, 6), (9, 5, 7) is a list of
length 2 of vectors in R^3.


2.2 Notation list of vectors 


We will usually write lists of vectors without surrounding parentheses. 


Linear Combinations and Span 


Adding up scalar multiples of vectors in a list gives what is called a linear 
combination of the list. Here is the formal definition: 


2.3 Definition linear combination 


A linear combination of a list v_1, ..., v_m of vectors in V is a vector of
the form

a_1 v_1 + ⋯ + a_m v_m,

where a_1, ..., a_m ∈ F.


2.4 Example In F^3,
• (17, −4, 2) is a linear combination of (2, 1, −3), (1, −2, 4) because
(17, −4, 2) = 6(2, 1, −3) + 5(1, −2, 4).
• (17, −4, 5) is not a linear combination of (2, 1, −3), (1, −2, 4) because
there do not exist numbers a_1, a_2 ∈ F such that
(17, −4, 5) = a_1(2, 1, −3) + a_2(1, −2, 4).
In other words, the system of equations
17 = 2a_1 + a_2
−4 = a_1 − 2a_2
5 = −3a_1 + 4a_2

has no solutions (as you should verify).

2.5 Definition span

The set of all linear combinations of a list of vectors v_1, ..., v_m in V is
called the span of v_1, ..., v_m, denoted span(v_1, ..., v_m). In other words,
span(v_1, ..., v_m) = {a_1 v_1 + ⋯ + a_m v_m : a_1, ..., a_m ∈ F}.

The span of the empty list ( ) is defined to be {0}.

2.6 Example The previous example shows that in F^3,
• (17, −4, 2) ∈ span((2, 1, −3), (1, −2, 4));
• (17, −4, 5) ∉ span((2, 1, −3), (1, −2, 4)).

Some mathematicians use the term linear span, which means the same as
span.
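Deciding whether a vector lies in a span amounts to solving a system of linear equations, as in Example 2.4. The following sketch does this over the rationals with a small Gauss-Jordan elimination. The function name solve is hypothetical, and the routine is an illustration of the idea rather than a method taken from the text. It returns one list of coefficients when the target is a linear combination of the given vectors and None otherwise.

```python
from fractions import Fraction

def solve(vectors, target):
    """Return scalars a with sum(a[i] * vectors[i]) == target, or None if none exist."""
    n, m = len(target), len(vectors)
    # One row per coordinate, one column per vector, plus the target column.
    rows = [[Fraction(v[i]) for v in vectors] + [Fraction(target[i])] for i in range(n)]
    pivots, r = [], 0
    for c in range(m):
        p = next((i for i in range(r, n) if rows[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column
        rows[r], rows[p] = rows[p], rows[r]
        rows[r] = [x / rows[r][c] for x in rows[r]]
        for i in range(n):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        pivots.append((r, c))
        r += 1
    if any(row[m] != 0 for row in rows[r:]):
        return None  # a row reads 0 = nonzero, so the system has no solutions
    coeffs = [Fraction(0)] * m
    for pr, pc in pivots:
        coeffs[pc] = rows[pr][m]
    return coeffs

vs = [(2, 1, -3), (1, -2, 4)]
print(solve(vs, (17, -4, 2)))  # coefficients 6 and 5, as in Example 2.4
print(solve(vs, (17, -4, 5)))  # None: (17, -4, 5) is not in the span
```

With an empty list of vectors the routine returns an empty coefficient list only for the zero target, which matches the convention that the span of the empty list is {0}.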


2.7 Span is the smallest containing subspace 


The span of a list of vectors in V is the smallest subspace of V containing 
all the vectors in the list. 


Proof Suppose v_1, ..., v_m is a list of vectors in V.
First we show that span(v_1, ..., v_m) is a subspace of V. The additive
identity is in span(v_1, ..., v_m), because

0 = 0v_1 + ⋯ + 0v_m.
Also, span(v_1, ..., v_m) is closed under addition, because
(a_1 v_1 + ⋯ + a_m v_m) + (c_1 v_1 + ⋯ + c_m v_m) = (a_1 + c_1)v_1 + ⋯ + (a_m + c_m)v_m.
Furthermore, span(v_1, ..., v_m) is closed under scalar multiplication, because
λ(a_1 v_1 + ⋯ + a_m v_m) = λa_1 v_1 + ⋯ + λa_m v_m.

Thus span(v_1, ..., v_m) is a subspace of V (by 1.34).

Each v_j is a linear combination of v_1, ..., v_m (to show this, set a_j = 1
and let the other a's in 2.3 equal 0). Thus span(v_1, ..., v_m) contains each v_j.
Conversely, because subspaces are closed under scalar multiplication and
addition, every subspace of V containing each v_j contains span(v_1, ..., v_m).
Thus span(v_1, ..., v_m) is the smallest subspace of V containing all the vectors
v_1, ..., v_m. ∎

2.8 Definition spans

If span(v_1, ..., v_m) equals V, we say that v_1, ..., v_m spans V.

2.9 Example Suppose n is a positive integer. Show that
(1, 0, ..., 0), (0, 1, 0, ..., 0), ..., (0, ..., 0, 1)

spans F^n. Here the j-th vector in the list above is the n-tuple with 1 in the j-th
slot and 0 in all other slots.

Solution Suppose (x_1, ..., x_n) ∈ F^n. Then
(x_1, ..., x_n) = x_1(1, 0, ..., 0) + x_2(0, 1, 0, ..., 0) + ⋯ + x_n(0, ..., 0, 1).

Thus (x_1, ..., x_n) ∈ span((1, 0, ..., 0), (0, 1, 0, ..., 0), ..., (0, ..., 0, 1)), as
desired.


Now we can make one of the key definitions in linear algebra. 


2.10 Definition finite-dimensional vector space 


A vector space is called finite-dimensional if some list of vectors in it 
spans the space. 


Example 2.9 above shows that F^n is a finite-dimensional vector space for
every positive integer n.

Recall that by definition every list has finite length.

The definition of a polynomial is no doubt already familiar to you. 


2.11 Definition polynomial, P(F) 


• A function p : F → F is called a polynomial with coefficients in F
if there exist a_0, ..., a_m ∈ F such that

p(z) = a_0 + a_1 z + ⋯ + a_m z^m
for all z ∈ F.

• P(F) is the set of all polynomials with coefficients in F.

With the usual operations of addition and scalar multiplication, P(F) is a
vector space over F, as you should verify. In other words, P(F) is a subspace
of F^F, the vector space of functions from F to F.

If a polynomial (thought of as a function from F to F) is represented by 
two sets of coefficients, then subtracting one representation of the polynomial 
from the other produces a polynomial that is identically zero as a function 
on F and hence has all zero coefficients (if you are unfamiliar with this fact, 
just believe it for now; we will prove it later—see 4.7). Conclusion: the 
coefficients of a polynomial are uniquely determined by the polynomial. Thus 
the next definition uniquely defines the degree of a polynomial. 


2.12 Definition degree of a polynomial, deg p 


• A polynomial p ∈ P(F) is said to have degree m if there exist
scalars a_0, a_1, ..., a_m ∈ F with a_m ≠ 0 such that

p(z) = a_0 + a_1 z + ⋯ + a_m z^m
for all z ∈ F. If p has degree m, we write deg p = m.
• The polynomial that is identically 0 is said to have degree −∞.

In the next definition, we use the convention that −∞ < m, which means
that the polynomial 0 is in P_m(F).

2.13 Definition P_m(F)

For m a nonnegative integer, P_m(F) denotes the set of all polynomials
with coefficients in F and degree at most m.

To verify the next example, note that P_m(F) = span(1, z, ..., z^m); here
we are slightly abusing notation by letting z^k denote a function.

2.14 Example P_m(F) is a finite-dimensional vector space for each
nonnegative integer m.


2.15 Definition infinite-dimensional vector space

A vector space is called infinite-dimensional if it is not finite-dimensional.

2.16 Example Show that P(F) is infinite-dimensional.

Solution Consider any list of elements of P(F). Let m denote the highest
degree of the polynomials in this list. Then every polynomial in the span of
this list has degree at most m. Thus z^{m+1} is not in the span of our list. Hence
no list spans P(F). Thus P(F) is infinite-dimensional.


Linear Independence 


Suppose v_1, ..., v_m ∈ V and v ∈ span(v_1, ..., v_m). By the definition of span,
there exist a_1, ..., a_m ∈ F such that

v = a_1 v_1 + ⋯ + a_m v_m.

Consider the question of whether the choice of scalars in the equation above
is unique. Suppose c_1, ..., c_m is another set of scalars such that

v = c_1 v_1 + ⋯ + c_m v_m.
Subtracting the last two equations, we have
0 = (a_1 − c_1)v_1 + ⋯ + (a_m − c_m)v_m.

Thus we have written 0 as a linear combination of (v_1, ..., v_m). If the only
way to do this is the obvious way (using 0 for all scalars), then each a_j − c_j
equals 0, which means that each a_j equals c_j (and thus the choice of scalars
was indeed unique). This situation is so important that we give it a special
name, linear independence, which we now define.


2.17 Definition linearly independent 


• A list v_1, ..., v_m of vectors in V is called linearly independent if
the only choice of a_1, ..., a_m ∈ F that makes a_1 v_1 + ⋯ + a_m v_m
equal 0 is a_1 = ⋯ = a_m = 0.

• The empty list ( ) is also declared to be linearly independent.

The reasoning above shows that v_1, ..., v_m is linearly independent if and
only if each vector in span(v_1, ..., v_m) has only one representation as a linear
combination of v_1, ..., v_m.


2.18 Example linearly independent lists

(a) A list v of one vector v ∈ V is linearly independent if and only if v ≠ 0.

(b) A list of two vectors in V is linearly independent if and only if neither
vector is a scalar multiple of the other.

(c) (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0) is linearly independent in F^4.

(d) The list 1, z, ..., z^m is linearly independent in P(F) for each nonnegative
integer m.

If some vectors are removed from a linearly independent list, the remaining
list is also linearly independent, as you should verify.
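For lists in F^n with rational entries, Definition 2.17 can be tested by elimination: the equation a_1 v_1 + ⋯ + a_m v_m = 0 has only the zero solution exactly when every unknown a_j gets a pivot. The sketch below is one possible implementation of that check. The function name is_linearly_independent is hypothetical, and the second test list, (1, 2), (2, 4), is an example chosen here to illustrate part (b) rather than one from the text.

```python
from fractions import Fraction

def is_linearly_independent(vectors):
    """True if the only solution of a_1 v_1 + ... + a_m v_m = 0 is a_1 = ... = a_m = 0."""
    if not vectors:
        return True  # the empty list is declared linearly independent
    n, m = len(vectors[0]), len(vectors)
    # One row per coordinate, one column per unknown a_j.
    rows = [[Fraction(v[i]) for v in vectors] for i in range(n)]
    r = 0
    for c in range(m):
        p = next((i for i in range(r, n) if rows[i][c] != 0), None)
        if p is None:
            return False  # no pivot for a_c, so a nonzero solution exists
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(r + 1, n):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return True

print(is_linearly_independent([(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0)]))  # True, part (c)
print(is_linearly_independent([(1, 2), (2, 4)]))   # False: (2, 4) = 2(1, 2), part (b)
print(is_linearly_independent([(0, 0)]))           # False: the single vector is 0, part (a)
print(is_linearly_independent([(1, 0, 0, 0), (0, 0, 1, 0)]))  # True: a sublist of (c)
```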


2.19 Definition linearly dependent

• A list of vectors in V is called linearly dependent if it is not linearly
independent.

• In other words, a list v_1, ..., v_m of vectors in V is linearly dependent
if there exist a_1, ..., a_m ∈ F, not all 0, such that
a_1 v_1 + ⋯ + a_m v_m = 0.

2.20 Example linearly dependent lists

• (2, 3, 1), (1, −1, 2), (7, 3, 8) is linearly dependent in F^3 because
2(2, 3, 1) + 3(1, −1, 2) + (−1)(7, 3, 8) = (0, 0, 0).

• The list (2, 3, 1), (1, −1, 2), (7, 3, c) is linearly dependent in F^3 if and
only if c = 8, as you should verify.

• If some vector in a list of vectors in V is a linear combination of the
other vectors, then the list is linearly dependent. (Proof: After writing
one vector in the list as equal to a linear combination of the other
vectors, move that vector to the other side of the equation, where it will
be multiplied by −1.)

• Every list of vectors in V containing the 0 vector is linearly dependent.
(This is a special case of the previous bullet point.)

The lemma below will often be useful. It states that given a linearly
dependent list of vectors, one of the vectors is in the span of the previous ones
and furthermore we can throw out that vector without changing the span of
the original list.


2.21 Linear Dependence Lemma 


Suppose v_1, ..., v_m is a linearly dependent list in V. Then there exists
j ∈ {1, 2, ..., m} such that the following hold:

(a) v_j ∈ span(v_1, ..., v_{j−1});
(b) if the j-th term is removed from v_1, ..., v_m, the span of the remaining
list equals span(v_1, ..., v_m).
Proof Because the list v_1, ..., v_m is linearly dependent, there exist numbers
a_1, ..., a_m ∈ F, not all 0, such that

a_1 v_1 + ⋯ + a_m v_m = 0.

Let j be the largest element of {1, ..., m} such that a_j ≠ 0. Then

2.22    v_j = −(a_1/a_j) v_1 − ⋯ − (a_{j−1}/a_j) v_{j−1},
proving (a).
To prove (b), suppose u ∈ span(v_1, ..., v_m). Then there exist numbers
c_1, ..., c_m ∈ F such that

u = c_1 v_1 + ⋯ + c_m v_m.

In the equation above, we can replace v_j with the right side of 2.22, which
shows that u is in the span of the list obtained by removing the j-th term from
v_1, ..., v_m. Thus (b) holds. ∎

Choosing j = 1 in the Linear Dependence Lemma above means that
v_1 = 0, because if j = 1 then condition (a) above is interpreted to mean that
v_1 ∈ span( ); recall that span( ) = {0}. Note also that the proof of part (b)
above needs to be modified in an obvious way if v_1 = 0 and j = 1.
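The Linear Dependence Lemma can be explored computationally for lists in F^n with rational entries. The sketch below searches for the smallest index j with v_j ∈ span(v_1, ..., v_{j−1}). That is a different choice of j from the one made in the proof (which takes the largest index with a nonzero coefficient in one particular dependence relation), but any j satisfying (a) also satisfies (b) by the argument above. The names solve and dependence_index are hypothetical helpers written for this illustration.

```python
from fractions import Fraction

def solve(vectors, target):
    """Return scalars a with sum(a[i] * vectors[i]) == target, or None if none exist."""
    n, m = len(target), len(vectors)
    rows = [[Fraction(v[i]) for v in vectors] + [Fraction(target[i])] for i in range(n)]
    pivots, r = [], 0
    for c in range(m):
        p = next((i for i in range(r, n) if rows[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column
        rows[r], rows[p] = rows[p], rows[r]
        rows[r] = [x / rows[r][c] for x in rows[r]]
        for i in range(n):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        pivots.append((r, c))
        r += 1
    if any(row[m] != 0 for row in rows[r:]):
        return None  # inconsistent system
    coeffs = [Fraction(0)] * m
    for pr, pc in pivots:
        coeffs[pc] = rows[pr][m]
    return coeffs

def dependence_index(vectors):
    """Smallest j (1-based) with v_j in span(v_1, ..., v_{j-1}), or None if there is none."""
    for j in range(1, len(vectors) + 1):
        if solve(vectors[:j - 1], vectors[j - 1]) is not None:
            return j
    return None

vs = [(2, 3, 1), (1, -1, 2), (7, 3, 8)]   # the dependent list from Example 2.20
j = dependence_index(vs)
rest = vs[:j - 1] + vs[j:]                # the list with the j-th term removed
same_span = all(solve(rest, v) is not None for v in vs)
print(j, same_span)  # 3 True
```

For a list whose first vector is 0 the function returns j = 1, in line with the remark that j = 1 corresponds to v_1 ∈ span( ) = {0}.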

In general, the proofs in the rest of the book will not call attention to 
special cases that must be considered involving empty lists, lists of length 1, 
the subspace {0}, or other trivial cases for which the result is clearly true but 
needs a slightly different proof. Be sure to check these special cases yourself. 

Now we come to a key result. It says that no linearly independent list in V 
is longer than a spanning list in V. 
2.23 Length of linearly independent list ≤ length of spanning list 


In a finite-dimensional vector space, the length of every linearly indepen- 
dent list of vectors is less than or equal to the length of every spanning list 
of vectors. 


Proof Suppose u_1, ..., u_m is linearly independent in V. Suppose also that 
w_1, ..., w_n spans V. We need to prove that m ≤ n. We do so through the 
multi-step process described below; note that in each step we add one of the 
u's and remove one of the w's. 


Step 1 
Let B be the list w_1, ..., w_n, which spans V. Thus adjoining any vector 
in V to this list produces a linearly dependent list (because the newly 
adjoined vector can be written as a linear combination of the other 
vectors). In particular, the list 


u_1, w_1, ..., w_n 


is linearly dependent. Thus by the Linear Dependence Lemma (2.21), 
one of the vectors in this list is in the span of the previous ones. That 
vector is not u_1 (because u_1 ≠ 0), so it is one of the w's. We can remove 
that w so that the new list B (of length n) consisting of u_1 and the 
remaining w's spans V. 


Step j 

The list B (of length n) from step j - 1 spans V. Thus adjoining any 
vector to this list produces a linearly dependent list. In particular, the 
list of length (n + 1) obtained by adjoining u_j to B, placing it just after 
u_1, ..., u_{j-1}, is linearly dependent. By the Linear Dependence Lemma 
(2.21), one of the vectors in this list is in the span of the previous ones, 
and because u_1, ..., u_j is linearly independent, this vector is one of 
the w's, not one of the u's. We can remove that w from B so that the 
new list B (of length n) consisting of u_1, ..., u_j and the remaining w's 
spans V. 


After step m, we have added all the u's and the process stops. At each step as 
we add a u to B, the Linear Dependence Lemma implies that there is some w 
to remove. Thus there are at least as many w's as u's. ∎ 


The next two examples show how the result above can be used to show, 
without any computations, that certain lists are not linearly independent and 
that certain lists do not span a given vector space. 
2.24 Example Show that the list (1, 2, 3), (4, 5, 8), (9, 6, 7), (-3, 2, 8) is 
not linearly independent in R^3. 


Solution The list (1, 0, 0), (0, 1, 0), (0, 0, 1) spans R^3. Thus no list of length 
larger than 3 is linearly independent in R^3. 


2.25 Example Show that the list (1, 2, 3, -5), (4, 5, 8, 3), (9, 6, 7, -1) 
does not span R^4. 


Solution The list (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1) is linearly 
independent in R^4. Thus no list of length less than 4 spans R^4. 


Our intuition suggests that every subspace of a finite-dimensional vector 
space should also be finite-dimensional. We now prove that this intuition is 
correct. 


2.26 Finite-dimensional subspaces 


Every subspace of a finite-dimensional vector space is finite-dimensional. 


Proof Suppose V is finite-dimensional and U is a subspace of V. We need to 
prove that U is finite-dimensional. We do this through the following multi-step 
construction. 


Step 1 
If U = {0}, then U is finite-dimensional and we are done. If U ≠ {0}, 
then choose a nonzero vector v_1 ∈ U. 


Step j 
If U = span(v_1, ..., v_{j-1}), then U is finite-dimensional and we are 
done. If U ≠ span(v_1, ..., v_{j-1}), then choose a vector v_j ∈ U such 
that 


v_j ∉ span(v_1, ..., v_{j-1}). 


After each step, as long as the process continues, we have constructed a list of 
vectors such that no vector in this list is in the span of the previous vectors. 
Thus after each step we have constructed a linearly independent list, by the 
Linear Dependence Lemma (2.21). This linearly independent list cannot be 
longer than any spanning list of V (by 2.23). Thus the process eventually 
terminates, which means that U is finite-dimensional. ∎ 
EXERCISES 2.A 


1 Suppose v_1, v_2, v_3, v_4 spans V. Prove that the list 
v_1 - v_2, v_2 - v_3, v_3 - v_4, v_4 
also spans V. 


2 Verify the assertions in Example 2.18. 


3 Find a number t such that 
(3, 1, 4), (2, -3, 5), (5, 9, t) 
is not linearly independent in R^3. 


4 Verify the assertion in the second bullet point in Example 2.20. 


5 (a) Show that if we think of C as a vector space over R, then the list 
(1 + i, 1 - i) is linearly independent. 
(b) Show that if we think of C as a vector space over C, then the list 
(1 + i, 1 - i) is linearly dependent. 


6 Suppose v_1, v_2, v_3, v_4 is linearly independent in V. Prove that the list 
v_1 - v_2, v_2 - v_3, v_3 - v_4, v_4 
is also linearly independent. 


7 Prove or give a counterexample: If v_1, v_2, ..., v_m is a linearly 
independent list of vectors in V, then 
5v_1 - 4v_2, v_2, v_3, ..., v_m 
is linearly independent. 


8 Prove or give a counterexample: If v_1, v_2, ..., v_m is a linearly 
independent list of vectors in V and λ ∈ F with λ ≠ 0, then 
λv_1, λv_2, ..., λv_m is linearly independent. 


9 Prove or give a counterexample: If v_1, ..., v_m and w_1, ..., w_m are 
linearly independent lists of vectors in V, then v_1 + w_1, ..., v_m + w_m is 
linearly independent. 


10 Suppose v_1, ..., v_m is linearly independent in V and w ∈ V. Prove that 
if v_1 + w, ..., v_m + w is linearly dependent, then w ∈ span(v_1, ..., v_m). 


11 Suppose v_1, ..., v_m is linearly independent in V and w ∈ V. Show that 
v_1, ..., v_m, w is linearly independent if and only if 
w ∉ span(v_1, ..., v_m). 


12 Explain why there does not exist a list of six polynomials that is linearly 
independent in P_4(F). 


13 Explain why no list of four polynomials spans P_4(F). 


14 Prove that V is infinite-dimensional if and only if there is a sequence 
v_1, v_2, ... of vectors in V such that v_1, ..., v_m is linearly independent 
for every positive integer m. 


15 Prove that F^∞ is infinite-dimensional. 


16 Prove that the real vector space of all continuous real-valued functions 
on the interval [0, 1] is infinite-dimensional. 


17 Suppose p_0, p_1, ..., p_m are polynomials in P_m(F) such that p_j(2) = 0 
for each j. Prove that p_0, p_1, ..., p_m is not linearly independent in 
P_m(F). 
2.B Bases 


In the last section, we discussed linearly independent lists and spanning lists. 
Now we bring these concepts together. 
2.27 Definition basis 


A basis of V is a list of vectors in V that is linearly independent and 
spans V. 


2.28 Example bases 


(a) The list (1, 0, ..., 0), (0, 1, 0, ..., 0), ..., (0, ..., 0, 1) is a basis of F^n, 
called the standard basis of F^n. 


(b) The list (1, 2), (3, 5) is a basis of F^2. 


(c) The list (1, 2, -4), (7, -5, 6) is linearly independent in F^3 but is not a 
basis of F^3 because it does not span F^3. 


(d) The list (1, 2), (3, 5), (4, 13) spans F^2 but is not a basis of F^2 because 
it is not linearly independent. 


(e) The list (1, 1, 0), (0, 0, 1) is a basis of {(x, x, y) ∈ F^3 : x, y ∈ F}. 

(f) The list (1, -1, 0), (1, 0, -1) is a basis of 
{(x, y, z) ∈ F^3 : x + y + z = 0}. 

(g) The list 1, z, ..., z^m is a basis of P_m(F). 


In addition to the standard basis, F^n has many other bases. For example, 
(7, 5), (-4, 9) and (1, 2), (3, 5) are both bases of F^2. 

The next result helps explain why bases are useful. Recall that “uniquely” 
means “in only one way”. 


2.29 Criterion for basis 


A list v_1, ..., v_n of vectors in V is a basis of V if and only if every v ∈ V 
can be written uniquely in the form 


2.30 v = a_1 v_1 + ... + a_n v_n, 


where a_1, ..., a_n ∈ F. 

Proof First suppose that v_1, ..., v_n is a basis of V. Let v ∈ V. Because 
v_1, ..., v_n spans V, there exist a_1, ..., a_n ∈ F such that 2.30 holds. To 
show that the representation in 2.30 is unique, suppose c_1, ..., c_n are scalars 
such that we also have 


v = c_1 v_1 + ... + c_n v_n. 

Subtracting the last equation from 2.30, we get 

0 = (a_1 - c_1) v_1 + ... + (a_n - c_n) v_n. 


This implies that each a_j - c_j equals 0 (because v_1, ..., v_n is linearly 
independent). Hence a_1 = c_1, ..., a_n = c_n. We have the desired uniqueness, 
completing the proof in one direction. 

For the other direction, suppose every v ∈ V can be written uniquely in 
the form given by 2.30. Clearly this implies that v_1, ..., v_n spans V. To show 
that v_1, ..., v_n is linearly independent, suppose a_1, ..., a_n ∈ F are such that 


0 = a_1 v_1 + ... + a_n v_n. 


The uniqueness of the representation 2.30 (taking v = 0) now implies that 
a_1 = ... = a_n = 0. Thus v_1, ..., v_n is linearly independent and hence is a 
basis of V. ∎ 

This proof is essentially a repetition of the ideas that led us to the 
definition of linear independence. 


A spanning list in a vector space may not be a basis because it is not 
linearly independent. Our next result says that given any spanning list, some 
(possibly none) of the vectors in it can be discarded so that the remaining list 
is linearly independent and still spans the vector space. 

As an example in the vector space F^2, if the procedure in the proof below 
is applied to the list (1, 2), (3, 6), (4, 7), (5, 9), then the second and fourth 
vectors will be removed. This leaves (1, 2), (4, 7), which is a basis of F^2. 


2.31 Spanning list contains a basis 


Every spanning list in a vector space can be reduced to a basis of the 
vector space. 


Proof Suppose v_1, ..., v_n spans V. We want to remove some of the vectors 
from v_1, ..., v_n so that the remaining vectors form a basis of V. We do this 
through the multi-step process described below. 

Start with B equal to the list v_1, ..., v_n. 

Step 1 
If v_1 = 0, delete v_1 from B. If v_1 ≠ 0, leave B unchanged. 

Step j 
If v_j is in span(v_1, ..., v_{j-1}), delete v_j from B. If v_j is not in 
span(v_1, ..., v_{j-1}), leave B unchanged. 


Stop the process after step n, getting a list B. This list B spans V because our 
original list spanned V and we have discarded only vectors that were already 
in the span of the previous vectors. The process ensures that no vector in B 
is in the span of the previous ones. Thus B is linearly independent, by the 
Linear Dependence Lemma (2.21). Hence B is a basis of V. ∎ 
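
The multi-step process in the proof of 2.31 translates almost directly into code for lists in F^n with rational entries. The sketch below is illustrative only: the names `rank` and `reduce_to_basis` are assumptions of this sketch, and membership in a span is tested by checking whether adjoining a vector leaves the rank unchanged. Because every deleted vector lies in the span of the earlier ones, testing against the vectors kept so far is equivalent to testing against span(v_1, ..., v_{j-1}).

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors, by Gaussian elimination over exact rationals."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def reduce_to_basis(spanning_list):
    """Discard each vector that lies in the span of the vectors before it."""
    kept = []
    for v in spanning_list:
        # v is outside span(kept) exactly when adjoining it raises the rank.
        if rank(kept + [v]) > rank(kept):
            kept.append(v)
    return kept

basis = reduce_to_basis([(1, 2), (3, 6), (4, 7), (5, 9)])
print(basis)
```

Run on the list (1, 2), (3, 6), (4, 7), (5, 9) from the example preceding 2.31, the function discards the second and fourth vectors and keeps (1, 2), (4, 7).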


Our next result, an easy corollary of the previous result, tells us that every 
finite-dimensional vector space has a basis. 


2.32 Basis of finite-dimensional vector space 


Every finite-dimensional vector space has a basis. 


Proof By definition, a finite-dimensional vector space has a spanning list. 
The previous result tells us that each spanning list can be reduced to a basis. ∎ 


Our next result is in some sense a dual of 2.31, which said that every 
spanning list can be reduced to a basis. Now we show that given any linearly 
independent list, we can adjoin some additional vectors (this includes the 
possibility of adjoining no additional vectors) so that the extended list is still 
linearly independent but also spans the space. 


2.33 Linearly independent list extends to a basis 


Every linearly independent list of vectors in a finite-dimensional vector 
space can be extended to a basis of the vector space. 


Proof Suppose u_1, ..., u_m is linearly independent in a finite-dimensional 
vector space V. Let w_1, ..., w_n be a basis of V. Thus the list 

u_1, ..., u_m, w_1, ..., w_n 

spans V. Applying the procedure of the proof of 2.31 to reduce this list to a 
basis of V produces a basis consisting of the vectors u_1, ..., u_m (none of the 
u's get deleted in this procedure because u_1, ..., u_m is linearly independent) 
and some of the w's. ∎ 
As an example in F^3, suppose we start with the linearly independent 
list (2, 3, 4), (9, 6, 8). If we take w_1, w_2, w_3 in the proof above to be the 
standard basis of F^3, then the procedure in the proof above produces the list 
(2, 3, 4), (9, 6, 8), (0, 1, 0), which is a basis of F^3. 
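
The extension procedure of 2.33 can be sketched in the same way: adjoin a spanning list to the linearly independent list and keep only the vectors that are not already in the span of the earlier ones. In the Python sketch below, `rank` and `extend_to_basis` are hypothetical helper names, exact rational arithmetic is assumed, and the first argument is assumed to be linearly independent.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors, by Gaussian elimination over exact rationals."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def extend_to_basis(independent, spanning):
    """Extend `independent` (assumed linearly independent) using `spanning`.

    This is the reduction of 2.31 applied to the list u_1, ..., u_m,
    w_1, ..., w_n; no u is ever discarded because the u's are independent.
    """
    kept = list(independent)
    for w in spanning:
        if rank(kept + [w]) > rank(kept):
            kept.append(w)
    return kept

standard_basis = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
extended = extend_to_basis([(2, 3, 4), (9, 6, 8)], standard_basis)
print(extended)
```

Here (1, 0, 0) is discarded because it equals -(2/5)(2, 3, 4) + (1/5)(9, 6, 8), the vector (0, 1, 0) is kept, and (0, 0, 1) is discarded because three linearly independent vectors already span F^3.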

As an application of the result above, we now show that every subspace of a 
finite-dimensional vector space can be paired with another subspace to form a 
direct sum of the whole space. 

Using the same basic ideas but considerably more advanced tools, the next 
result can be proved without the hypothesis that V is finite-dimensional. 


2.34 Every subspace of V is part of a direct sum equal to V 


Suppose V is finite-dimensional and U is a subspace of V. Then there is a 
subspace W of V such that V = U ⊕ W. 


Proof Because V is finite-dimensional, so is U (see 2.26). Thus there is 
a basis u_1, ..., u_m of U (see 2.32). Of course u_1, ..., u_m is a linearly 
independent list of vectors in V. Hence this list can be extended to a basis 
u_1, ..., u_m, w_1, ..., w_n of V (see 2.33). Let W = span(w_1, ..., w_n). 

To prove that V = U ⊕ W, by 1.45 we need only show that 


V = U + W and U ∩ W = {0}. 


To prove the first equation above, suppose v ∈ V. Then, because the list 
u_1, ..., u_m, w_1, ..., w_n spans V, there exist a_1, ..., a_m, b_1, ..., b_n ∈ F such 
that 

v = a_1 u_1 + ... + a_m u_m + b_1 w_1 + ... + b_n w_n. 

Let u = a_1 u_1 + ... + a_m u_m and w = b_1 w_1 + ... + b_n w_n. 
In other words, we have v = u + w, where u ∈ U and w ∈ W are defined as 
above. Thus v ∈ U + W, completing the proof that V = U + W. 

To show that U ∩ W = {0}, suppose v ∈ U ∩ W. Then there exist scalars 
a_1, ..., a_m, b_1, ..., b_n ∈ F such that 


v = a_1 u_1 + ... + a_m u_m = b_1 w_1 + ... + b_n w_n. 


Thus 

a_1 u_1 + ... + a_m u_m - b_1 w_1 - ... - b_n w_n = 0. 

Because u_1, ..., u_m, w_1, ..., w_n is linearly independent, this implies that 
a_1 = ... = a_m = b_1 = ... = b_n = 0. Thus v = 0, completing the proof 
that U ∩ W = {0}. ∎ 
EXERCISES 2.B 


1 Find all vector spaces that have exactly one basis. 
2 Verify all the assertions in Example 2.28. 


3 (a) Let U be the subspace of R^5 defined by 
U = {(x_1, x_2, x_3, x_4, x_5) ∈ R^5 : x_1 = 3x_2 and x_3 = 7x_4}. 
Find a basis of U. 
(b) Extend the basis in part (a) to a basis of R^5. 
(c) Find a subspace W of R^5 such that R^5 = U ⊕ W. 


4 (a) Let U be the subspace of C^5 defined by 
U = {(z_1, z_2, z_3, z_4, z_5) ∈ C^5 : 6z_1 = z_2 and z_3 + 2z_4 + 3z_5 = 0}. 
Find a basis of U. 
(b) Extend the basis in part (a) to a basis of C^5. 
(c) Find a subspace W of C^5 such that C^5 = U ⊕ W. 


5 Prove or disprove: there exists a basis p_0, p_1, p_2, p_3 of P_3(F) such that 
none of the polynomials p_0, p_1, p_2, p_3 has degree 2. 


6 Suppose v_1, v_2, v_3, v_4 is a basis of V. Prove that 
v_1 + v_2, v_2 + v_3, v_3 + v_4, v_4 
is also a basis of V. 


7 Prove or give a counterexample: If v_1, v_2, v_3, v_4 is a basis of V and U 
is a subspace of V such that v_1, v_2 ∈ U and v_3 ∉ U and v_4 ∉ U, then 
v_1, v_2 is a basis of U. 


8 Suppose U and W are subspaces of V such that V = U ⊕ W. Suppose 
also that u_1, ..., u_m is a basis of U and w_1, ..., w_n is a basis of W. 
Prove that 

u_1, ..., u_m, w_1, ..., w_n 

is a basis of V. 
2.C Dimension 


Although we have been discussing finite-dimensional vector spaces, we have 
not yet defined the dimension of such an object. How should dimension be 
defined? A reasonable definition should force the dimension of F^n to equal n. 
Notice that the standard basis 


(1, 0, ..., 0), (0, 1, 0, ..., 0), ..., (0, ..., 0, 1) 


of F^n has length n. Thus we are tempted to define the dimension as the length 
of a basis. However, a finite-dimensional vector space in general has many 
different bases, and our attempted definition makes sense only if all bases in a 
given vector space have the same length. Fortunately that turns out to be the 
case, as we now show. 


2.35 Basis length does not depend on basis 


Any two bases of a finite-dimensional vector space have the same length. 


Proof Suppose V is finite-dimensional. Let B_1 and B_2 be two bases of V. 
Then B_1 is linearly independent in V and B_2 spans V, so the length of B_1 is 
at most the length of B_2 (by 2.23). Interchanging the roles of B_1 and B_2, we 
also see that the length of B_2 is at most the length of B_1. Thus the length of 
B_1 equals the length of B_2, as desired. ∎ 


Now that we know that any two bases of a finite-dimensional vector space 
have the same length, we can formally define the dimension of such spaces. 


2.36 Definition dimension, dim V 


• The dimension of a finite-dimensional vector space is the length of 
any basis of the vector space. 


• The dimension of V (if V is finite-dimensional) is denoted by dim V. 


2.37 Example dimensions 
• dim F^n = n because the standard basis of F^n has length n. 


• dim P_m(F) = m + 1 because the basis 1, z, ..., z^m of P_m(F) has 
length m + 1. 

Every subspace of a finite-dimensional vector space is finite-dimensional 
(by 2.26) and so has a dimension. The next result gives the expected inequality 
about the dimension of a subspace. 


2.38 Dimension of a subspace 

If V is finite-dimensional and U is a subspace of V, then dim U ≤ dim V. 

Proof Suppose V is finite-dimensional and U is a subspace of V. Think of a 
basis of U as a linearly independent list in V, and think of a basis of V as a 
spanning list in V. Now use 2.23 to conclude that dim U ≤ dim V. ∎ 


To check that a list of vectors in V is a basis of V, we must, according to 
the definition, show that the list in question satisfies two properties: it must be 
linearly independent and it must span V. The next two results show that if the 
list in question has the right length, then we need only check that it satisfies one 
of the two required properties. First we prove that every linearly independent 
list with the right length is a basis. 

The real vector space R^2 has dimension 2; the complex vector space C has 
dimension 1. As sets, R^2 can be identified with C (and addition is the same on 
both spaces, as is scalar multiplication by real numbers). Thus when we talk 
about the dimension of a vector space, the role played by the choice of F 
cannot be neglected. 


2.39 Linearly independent list of the right length is a basis 


Suppose V is finite-dimensional. Then every linearly independent list of 
vectors in V with length dim V is a basis of V. 


Proof Suppose dim V = n and v_1, ..., v_n is linearly independent in V. The 
list v_1, ..., v_n can be extended to a basis of V (by 2.33). However, every basis 
of V has length n, so in this case the extension is the trivial one, meaning that 
no elements are adjoined to v_1, ..., v_n. In other words, v_1, ..., v_n is a basis 
of V, as desired. ∎ 


2.40 Example Show that the list (5, 7), (4, 3) is a basis of F^2. 


Solution This list of two vectors in F^2 is obviously linearly independent 
(because neither vector is a scalar multiple of the other). Note that F^2 has 
dimension 2. Thus 2.39 implies that the linearly independent list (5, 7), (4, 3) 
of length 2 is a basis of F^2 (we do not need to bother checking that it spans F^2). 
2.41 Example Show that 1, (x - 5)^2, (x - 5)^3 is a basis of the subspace 
U of P_3(R) defined by 


U = {p ∈ P_3(R) : p'(5) = 0}. 


Solution Clearly each of the polynomials 1, (x - 5)^2, and (x - 5)^3 is in U. 
Suppose a, b, c ∈ R and 


a + b(x - 5)^2 + c(x - 5)^3 = 0 


for every x ∈ R. Without explicitly expanding the left side of the equation 
above, we can see that the left side has a cx^3 term. Because the right side has 
no x^3 term, this implies that c = 0. Because c = 0, we see that the left side 
has a bx^2 term, which implies that b = 0. Because b = c = 0, we can also 
conclude that a = 0. 

Thus the equation above implies that a = b = c = 0. Hence the list 
1, (x - 5)^2, (x - 5)^3 is linearly independent in U. 

Thus dim U ≥ 3. Because U is a subspace of P_3(R), we know that 
dim U ≤ dim P_3(R) = 4 (by 2.38). However, dim U cannot equal 4, because 
otherwise when we extend a basis of U to a basis of P_3(R) we would get a 
list with length greater than 4. Hence dim U = 3. Thus 2.39 implies that the 
linearly independent list 1, (x - 5)^2, (x - 5)^3 is a basis of U. 


Now we prove that a spanning list with the right length is a basis. 


2.42 Spanning list of the right length is a basis 


Suppose V is finite-dimensional. Then every spanning list of vectors in V 
with length dim V is a basis of V. 


Proof Suppose dim V = n and v_1, ..., v_n spans V. The list v_1, ..., v_n can 
be reduced to a basis of V (by 2.31). However, every basis of V has length 
n, so in this case the reduction is the trivial one, meaning that no elements 
are deleted from v_1, ..., v_n. In other words, v_1, ..., v_n is a basis of V, as 
desired. ∎ 


The next result gives a formula for the dimension of the sum of two 
subspaces of a finite-dimensional vector space. This formula is analogous 
to a familiar counting formula: the number of elements in the union of two 
finite sets equals the number of elements in the first set, plus the number of 
elements in the second set, minus the number of elements in the intersection 
of the two sets. 
2.43 Dimension of a sum 


If U_1 and U_2 are subspaces of a finite-dimensional vector space, then 
dim(U_1 + U_2) = dim U_1 + dim U_2 - dim(U_1 ∩ U_2). 


Proof Let u_1, ..., u_m be a basis of U_1 ∩ U_2; thus dim(U_1 ∩ U_2) = m. 
Because u_1, ..., u_m is a basis of U_1 ∩ U_2, it is linearly independent in U_1. 
Hence this list can be extended to a basis u_1, ..., u_m, v_1, ..., v_j of U_1 
(by 2.33). Thus dim U_1 = m + j. Also extend u_1, ..., u_m to a basis 
u_1, ..., u_m, w_1, ..., w_k of U_2; thus dim U_2 = m + k. 

We will show that 

u_1, ..., u_m, v_1, ..., v_j, w_1, ..., w_k 

is a basis of U_1 + U_2. This will complete the proof, because then we will have 

dim(U_1 + U_2) = m + j + k 
= (m + j) + (m + k) - m 
= dim U_1 + dim U_2 - dim(U_1 ∩ U_2). 

Clearly span(u_1, ..., u_m, v_1, ..., v_j, w_1, ..., w_k) contains U_1 and U_2 and 
hence equals U_1 + U_2. So to show that this list is a basis of U_1 + U_2 we need 
only show that it is linearly independent. To prove this, suppose 

a_1 u_1 + ... + a_m u_m + b_1 v_1 + ... + b_j v_j + c_1 w_1 + ... + c_k w_k = 0, 

where all the a's, b's, and c's are scalars. We need to prove that all the a's, 
b's, and c's equal 0. The equation above can be rewritten as 

c_1 w_1 + ... + c_k w_k = -a_1 u_1 - ... - a_m u_m - b_1 v_1 - ... - b_j v_j, 

which shows that c_1 w_1 + ... + c_k w_k ∈ U_1. All the w's are in U_2, so this 
implies that c_1 w_1 + ... + c_k w_k ∈ U_1 ∩ U_2. Because u_1, ..., u_m is a basis 
of U_1 ∩ U_2, we can write 

c_1 w_1 + ... + c_k w_k = d_1 u_1 + ... + d_m u_m 


for some choice of scalars d_1, ..., d_m. But u_1, ..., u_m, w_1, ..., w_k is linearly 
independent, so the last equation implies that all the c's (and d's) equal 0. 
Thus our original equation involving the a's, b's, and c's becomes 


a_1 u_1 + ... + a_m u_m + b_1 v_1 + ... + b_j v_j = 0. 


Because the list u_1, ..., u_m, v_1, ..., v_j is linearly independent, this equation 
implies that all the a's and b's are 0. We now know that all the a's, b's, and 
c's equal 0, as desired. ∎ 
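
A small numerical instance may help make the formula in 2.43 concrete. The example below is not from the text: the subspaces U_1 = span((1, 1, 0), (0, 1, 1)) and U_2 = span((1, 0, 0), (0, 0, 1)) of R^3 are chosen for illustration. A hand calculation shows that a vector (a, a + b, b) of U_1 lies in U_2 exactly when a + b = 0, so the intersection is spanned by (1, 0, -1) and has dimension 1. The Python sketch (with a hypothetical helper `rank`) computes the remaining dimensions and compares both sides of the formula.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors, by Gaussian elimination over exact rationals."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def in_span(v, vectors):
    """v is in span(vectors) exactly when adjoining v leaves the rank unchanged."""
    return rank(vectors + [v]) == rank(vectors)

U1 = [(1, 1, 0), (0, 1, 1)]   # spanning list of U_1 (illustrative choice)
U2 = [(1, 0, 0), (0, 0, 1)]   # spanning list of U_2 (illustrative choice)
common = (1, 0, -1)           # spans the intersection, as computed by hand

dim_U1, dim_U2 = rank(U1), rank(U2)
dim_sum = rank(U1 + U2)       # U_1 + U_2 is spanned by the concatenated list
dim_intersection = 1          # from the hand calculation in the lead-in

print(dim_sum, dim_U1 + dim_U2 - dim_intersection)
```

The check `in_span(common, U1) and in_span(common, U2)` confirms that the hand-computed vector really lies in both subspaces. The code does not itself prove that the intersection is one-dimensional.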
EXERCISES 2.C 


1 Suppose V is finite-dimensional and U is a subspace of V such that 
dim U = dim V. Prove that U = V. 


2 Show that the subspaces of R^2 are precisely {0}, R^2, and all lines in R^2 
through the origin. 


3 Show that the subspaces of R^3 are precisely {0}, R^3, all lines in R^3 
through the origin, and all planes in R^3 through the origin. 

4 (a) Let U = {p ∈ P_4(F) : p(6) = 0}. Find a basis of U. 

(b) Extend the basis in part (a) to a basis of P_4(F). 

(c) Find a subspace W of P_4(F) such that P_4(F) = U ⊕ W. 

5 (a) Let U = {p ∈ P_4(R) : p''(6) = 0}. Find a basis of U. 

(b) Extend the basis in part (a) to a basis of P_4(R). 

(c) Find a subspace W of P_4(R) such that P_4(R) = U ⊕ W. 

6 (a) Let U = {p ∈ P_4(F) : p(2) = p(5)}. Find a basis of U. 

(b) Extend the basis in part (a) to a basis of P_4(F). 

(c) Find a subspace W of P_4(F) such that P_4(F) = U ⊕ W. 

7 (a) Let U = {p ∈ P_4(F) : p(2) = p(5) = p(6)}. Find a basis of U. 
(b) Extend the basis in part (a) to a basis of P_4(F). 

(c) Find a subspace W of P_4(F) such that P_4(F) = U ⊕ W. 

8 (a) Let U = {p ∈ P_4(R) : ∫_{-1}^{1} p = 0}. Find a basis of U. 

(b) Extend the basis in part (a) to a basis of P_4(R). 

(c) Find a subspace W of P_4(R) such that P_4(R) = U ⊕ W. 


9 Suppose v_1, ..., v_m is linearly independent in V and w ∈ V. Prove that 

dim span(v_1 + w, ..., v_m + w) ≥ m - 1. 


10 Suppose p_0, p_1, ..., p_m ∈ P(F) are such that each p_j has degree j. 
Prove that p_0, p_1, ..., p_m is a basis of P_m(F). 


11 Suppose that U and W are subspaces of R^8 such that dim U = 3, 
dim W = 5, and U + W = R^8. Prove that R^8 = U ⊕ W. 


12 Suppose U and W are both five-dimensional subspaces of R^9. Prove 
that U ∩ W ≠ {0}. 


13 Suppose U and W are both 4-dimensional subspaces of C^6. Prove that 
there exist two vectors in U ∩ W such that neither of these vectors is a 
scalar multiple of the other. 


14 Suppose U_1, ..., U_m are finite-dimensional subspaces of V. Prove that 
U_1 + ... + U_m is finite-dimensional and 

dim(U_1 + ... + U_m) ≤ dim U_1 + ... + dim U_m. 


15 Suppose V is finite-dimensional, with dim V = n ≥ 1. Prove that there 
exist 1-dimensional subspaces U_1, ..., U_n of V such that 

V = U_1 ⊕ ... ⊕ U_n. 


16 Suppose U_1, ..., U_m are finite-dimensional subspaces of V such that 
U_1 + ... + U_m is a direct sum. Prove that U_1 ⊕ ... ⊕ U_m is 
finite-dimensional and 

dim U_1 ⊕ ... ⊕ U_m = dim U_1 + ... + dim U_m. 

[The exercise above deepens the analogy between direct sums of subspaces 
and disjoint unions of subsets. Specifically, compare this exercise 
to the following obvious statement: if a set is written as a disjoint union 
of finite subsets, then the number of elements in the set equals the sum of 
the numbers of elements in the disjoint subsets.] 


17 You might guess, by analogy with the formula for the number of 
elements in the union of three subsets of a finite set, that if U_1, U_2, U_3 are 
subspaces of a finite-dimensional vector space, then 

dim(U_1 + U_2 + U_3) 
= dim U_1 + dim U_2 + dim U_3 
- dim(U_1 ∩ U_2) - dim(U_1 ∩ U_3) - dim(U_2 ∩ U_3) 
+ dim(U_1 ∩ U_2 ∩ U_3). 

Prove this or give a counterexample. 


CHAPTER 3 


Linear Maps 


German mathematician Carl 
Friedrich Gauss (1777-1855), who 
in 1809 published a method for 
solving systems of linear equations. 
This method, now called Gaussian 
elimination, was also used in a 
Chinese book published over 1600 
years earlier. 


So far our attention has focused on vector spaces. No one gets excited about 
vector spaces. The interesting part of linear algebra is the subject to which we 
now turn—linear maps. 

In this chapter we will frequently need another vector space, which we will 
call W, in addition to V. Thus our standing assumptions are now as follows: 


3.1 Notation F, V, W 


• F denotes R or C. 


• V and W denote vector spaces over F. 


LEARNING OBJECTIVES FOR THIS CHAPTER 
■ Fundamental Theorem of Linear Maps 
■ the matrix of a linear map with respect to given bases 
■ isomorphic vector spaces 
■ product spaces 
■ quotient spaces 
■ the dual space of a vector space and the dual of a linear map 

3.A The Vector Space of Linear Maps 


Definition and Examples of Linear Maps 


Now we are ready for one of the key definitions in linear algebra. 


3.2 Definition linear map 
A linear map from V to W is a function T: V → W with the following 
properties: 

additivity 
T(u + v) = Tu + Tv for all u, v ∈ V; 


homogeneity 
T(λv) = λ(Tv) for all λ ∈ F and all v ∈ V. 


Note that for linear maps we often use the notation Tv as well as the more 
standard functional notation T(v). 

Some mathematicians use the term linear transformation, which means the 
same as linear map. 


3.3 Notation L(V,W) 
The set of all linear maps from V to W is denoted L(V, W). 


Let’s look at some examples of linear maps. Make sure you verify that 
each of the functions defined below is indeed a linear map: 


3.4 Example linear maps 


zero 
In addition to its other uses, we let the symbol 0 denote the function that takes 
each element of some vector space to the additive identity of another vector 
space. To be specific, 0 ∈ L(V, W) is defined by 

0v = 0. 

The 0 on the left side of the equation above is a function from V to W, whereas 
the 0 on the right side is the additive identity in W. As usual, the context 
should allow you to distinguish between the many uses of the symbol 0. 


identity 
The identity map, denoted I, is the function on some vector space that takes 
each element to itself. To be specific, I ∈ L(V, V) is defined by 

Iv = v. 

differentiation 
Define D ∈ L(P(R), P(R)) by 

Dp = p'. 

The assertion that this function is a linear map is another way of stating a basic 
result about differentiation: (f + g)' = f' + g' and (λf)' = λf' whenever 
f, g are differentiable and λ is a constant. 


integration 
Define T ∈ L(P(R), R) by 

Tp = ∫_0^1 p(x) dx. 

The assertion that this function is linear is another way of stating a basic result 
about integration: the integral of the sum of two functions equals the sum 
of the integrals, and the integral of a constant times a function equals the 
constant times the integral of the function. 


multiplication by x^2 
Define T ∈ L(P(R), P(R)) by 

(Tp)(x) = x^2 p(x) 

for x ∈ R. 

backward shift 
Recall that F^∞ denotes the vector space of all sequences of elements of F. 
Define T ∈ L(F^∞, F^∞) by 

T(x_1, x_2, x_3, ...) = (x_2, x_3, ...). 

from R^3 to R^2 
Define T ∈ L(R^3, R^2) by 

T(x, y, z) = (2x - y + 3z, 7x + 5y - 6z). 

from F^n to F^m 
Generalizing the previous example, let m and n be positive integers, let 
A_{j,k} ∈ F for j = 1, ..., m and k = 1, ..., n, and define T ∈ L(F^n, F^m) by 

T(x_1, ..., x_n) = (A_{1,1} x_1 + ... + A_{1,n} x_n, ..., A_{m,1} x_1 + ... + A_{m,n} x_n). 

Actually every linear map from F^n to F^m is of this form. 
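
The last two examples can be tried out numerically. The Python sketch below is an illustration, not part of the text: the helper names `apply`, `add`, and `scale` are assumptions of this sketch, and the array A holds the coefficients of the map from R^3 to R^2 defined above. Checking additivity and homogeneity on a few sample vectors does not prove linearity; it only shows the two defining properties in action.

```python
def apply(A, x):
    """Apply T(x_1, ..., x_n) = (A[0][0]*x_1 + ... , ..., A[m-1][0]*x_1 + ...).

    A is a list of m rows, each of length n, and x has length n.
    """
    return tuple(sum(a * xk for a, xk in zip(row, x)) for row in A)

def add(u, v):
    """Componentwise sum of two tuples of the same length."""
    return tuple(a + b for a, b in zip(u, v))

def scale(lam, v):
    """Scalar multiple of a tuple."""
    return tuple(lam * a for a in v)

# Coefficients of T(x, y, z) = (2x - y + 3z, 7x + 5y - 6z).
A = [(2, -1, 3),
     (7, 5, -6)]

u, v, lam = (1, 2, 3), (-4, 0, 5), 3

additive = apply(A, add(u, v)) == add(apply(A, u), apply(A, v))
homogeneous = apply(A, scale(lam, u)) == scale(lam, apply(A, u))
print(apply(A, u), additive, homogeneous)
```

For the sample vector u = (1, 2, 3) the map gives Tu = (2 - 2 + 9, 7 + 10 - 18) = (9, -1).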
The existence part of the next result means that we can find a linear map 
that takes on whatever values we wish on the vectors in a basis. The uniqueness 
part of the next result means that a linear map is completely determined by its 
values on a basis. 
3.5 Linear maps and basis of domain 


Suppose v_1, ..., v_n is a basis of V and w_1, ..., w_n ∈ W. Then there exists 
a unique linear map T: V → W such that 


Tv_j = w_j 

for each j = 1, ..., n. 


Proof First we show the existence of a linear map T with the desired property. 
Define T: V → W by 


T(c_1 v_1 + ... + c_n v_n) = c_1 w_1 + ... + c_n w_n, 


where c_1, ..., c_n are arbitrary elements of F. The list v_1, ..., v_n is a basis 
of V, and thus the equation above does indeed define a function T from 
V to W (because each element of V can be uniquely written in the form 
c_1 v_1 + ... + c_n v_n). 
For each j, taking c_j = 1 and the other c's equal to 0 in the equation 
above shows that Tv_j = w_j. 
If u, v ∈ V with u = a_1 v_1 + ... + a_n v_n and v = c_1 v_1 + ... + c_n v_n, then 

T(u + v) = T((a_1 + c_1) v_1 + ... + (a_n + c_n) v_n) 
= (a_1 + c_1) w_1 + ... + (a_n + c_n) w_n 
= (a_1 w_1 + ... + a_n w_n) + (c_1 w_1 + ... + c_n w_n) 
= Tu + Tv. 


Similarly, if λ ∈ F and v = c_1 v_1 + ... + c_n v_n, then 


T(λv) = T(λc_1 v_1 + ... + λc_n v_n) 
= λc_1 w_1 + ... + λc_n w_n 
= λ(c_1 w_1 + ... + c_n w_n) 
= λTv. 


Thus T is a linear map from V to W. 

To prove uniqueness, now suppose that T ∈ L(V, W) and that Tv_j = w_j 
for j = 1, ..., n. Let c_1, ..., c_n ∈ F. The homogeneity of T implies that 
T(c_j v_j) = c_j w_j for j = 1, ..., n. The additivity of T now implies that 


T(c_1 v_1 + ... + c_n v_n) = c_1 w_1 + ... + c_n w_n. 


Thus T is uniquely determined on span(v_1, ..., v_n) by the equation above. 
Because v_1, ..., v_n is a basis of V, this implies that T is uniquely determined 
on V. ∎ 
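
The existence half of the proof of 3.5 is constructive, and for V = F^n with rational entries it can be carried out by machine: write v in terms of the basis, then use the same coefficients on the w's. The Python sketch below is a minimal illustration under stated assumptions. The function names `coordinates` and `linear_map_from_basis` are hypothetical, the basis (1, 2), (3, 5) of F^2 is taken from Example 2.28(b), and the target vectors in F^3 are arbitrary choices made for this sketch.

```python
from fractions import Fraction

def coordinates(basis, v):
    """Return c_1, ..., c_n with c_1 v_1 + ... + c_n v_n = v.

    Assumes `basis` really is a basis of F^n (n vectors of length n);
    otherwise the pivot search below raises StopIteration.
    """
    n = len(basis)
    # Row i of the augmented matrix: i-th entries of v_1, ..., v_n, then of v.
    M = [[Fraction(basis[j][i]) for j in range(n)] + [Fraction(v[i])]
         for i in range(n)]
    for col in range(n):
        pivot = next(i for i in range(col, n) if M[i][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [x / p for x in M[col]]          # make the pivot equal to 1
        for i in range(n):
            if i != col and M[i][col] != 0:
                f = M[i][col]
                M[i] = [a - f * b for a, b in zip(M[i], M[col])]
    return [M[i][n] for i in range(n)]

def linear_map_from_basis(basis, targets):
    """The map T with T(v_j) = w_j, built as in the existence part of 3.5."""
    def T(v):
        c = coordinates(basis, v)
        # T(c_1 v_1 + ... + c_n v_n) = c_1 w_1 + ... + c_n w_n
        return tuple(sum(cj * wj[k] for cj, wj in zip(c, targets))
                     for k in range(len(targets[0])))
    return T

basis = [(1, 2), (3, 5)]              # a basis of F^2, from Example 2.28(b)
targets = [(1, 0, 0), (0, 1, 1)]      # arbitrary values w_1, w_2 in F^3
T = linear_map_from_basis(basis, targets)
print(T((1, 2)), T((3, 5)), T((4, 7)))
```

Since (4, 7) = (1, 2) + (3, 5), additivity forces T(4, 7) = w_1 + w_2 = (1, 1, 1). This is the uniqueness half of 3.5 at work: once the values on the basis are fixed, there is no freedom left.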
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Algebraic Operations on L(V, W) 
We begin by defining addition and scalar multiplication on L(V, W). 


3.6 Definition addition and scalar multiplication on $\mathcal{L}(V, W)$


Suppose $S, T \in \mathcal{L}(V, W)$ and $\lambda \in \mathbf{F}$. The sum
$S + T$ and the product $\lambda T$ are the linear maps from $V$ to $W$
defined by

\[ (S + T)(v) = Sv + Tv \quad\text{and}\quad (\lambda T)(v) = \lambda (Tv) \]

for all $v \in V$.


You should verify that $S + T$ and $\lambda T$ as defined above are indeed
linear maps. In other words, if $S, T \in \mathcal{L}(V, W)$ and
$\lambda \in \mathbf{F}$, then $S + T \in \mathcal{L}(V, W)$ and
$\lambda T \in \mathcal{L}(V, W)$.

Although linear maps are pervasive throughout mathematics, they are not as
ubiquitous as imagined by some confused students who seem to think that cos
is a linear map from $\mathbf{R}$ to $\mathbf{R}$ when they write that
$\cos 2x$ equals $2 \cos x$ and that $\cos(x + y)$ equals $\cos x + \cos y$.

Because we took the trouble to define addition and scalar multiplication
on $\mathcal{L}(V, W)$, the next result should not be a surprise.


3.7 L(V, W) is a vector space 


With the operations of addition and scalar multiplication as defined above, 
L(V, W) is a vector space. 


The routine proof of the result above is left to the reader. Note that the 
additive identity of L(V, W) is the zero linear map defined earlier in this 
section. 

Usually it makes no sense to multiply together two elements of a vector 
space, but for some pairs of linear maps a useful product exists. We will need 
a third vector space, so for the rest of this section suppose U is a vector space 
over F. 


3.8 Definition product of linear maps


If $T \in \mathcal{L}(U, V)$ and $S \in \mathcal{L}(V, W)$, then the product
$ST \in \mathcal{L}(U, W)$ is defined by

\[ (ST)(u) = S(Tu) \]

for $u \in U$.

In other words, $ST$ is just the usual composition $S \circ T$ of two functions,
but when both functions are linear, most mathematicians write $ST$ instead
of $S \circ T$. You should verify that $ST$ is indeed a linear map from $U$ to
$W$ whenever $T \in \mathcal{L}(U, V)$ and $S \in \mathcal{L}(V, W)$.

Note that ST is defined only when T maps into the domain of S. 


3.9 Algebraic properties of products of linear maps


associativity

\[ (T_1 T_2) T_3 = T_1 (T_2 T_3) \]

whenever $T_1$, $T_2$, and $T_3$ are linear maps such that the products make
sense (meaning that $T_3$ maps into the domain of $T_2$, and $T_2$ maps into
the domain of $T_1$).


identity

\[ TI = IT = T \]

whenever $T \in \mathcal{L}(V, W)$ (the first $I$ is the identity map on $V$,
and the second $I$ is the identity map on $W$).


distributive properties

\[ (S_1 + S_2) T = S_1 T + S_2 T \quad\text{and}\quad S(T_1 + T_2) = S T_1 + S T_2 \]

whenever $T, T_1, T_2 \in \mathcal{L}(U, V)$ and $S, S_1, S_2 \in \mathcal{L}(V, W)$.

The routine proof of the result above is left to the reader.


Multiplication of linear maps is not commutative. In other words, it is not 
necessarily true that ST = TS, even if both sides of the equation make sense. 


3.10 Example Suppose $D \in \mathcal{L}(\mathcal{P}(\mathbf{R}), \mathcal{P}(\mathbf{R}))$
is the differentiation map defined in Example 3.4 and
$T \in \mathcal{L}(\mathcal{P}(\mathbf{R}), \mathcal{P}(\mathbf{R}))$ is the
multiplication by $x^2$ map defined earlier in this section. Show that
$TD \ne DT$.


Solution We have

\[ ((TD)p)(x) = x^2 p'(x) \quad\text{but}\quad ((DT)p)(x) = x^2 p'(x) + 2x p(x). \]

In other words, differentiating and then multiplying by $x^2$ is not the same
as multiplying by $x^2$ and then differentiating.
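The two products in this example are easy to compare on a concrete polynomial. The sketch below assumes a polynomial $a_0 + a_1 x + \cdots$ is stored as its coefficient list `[a0, a1, ...]`; that representation and the sample polynomial $p(x) = 1 + x$ are illustrative choices, not part of the text.

```python
def D(p):
    """Differentiate: sum a_k x^k  ->  sum k a_k x^(k-1)."""
    return [k * a for k, a in enumerate(p)][1:] or [0]

def T(p):
    """Multiply by x^2: every coefficient moves up two degrees."""
    return [0, 0] + list(p)

def evaluate(p, x):
    """Value of the polynomial with coefficient list p at the point x."""
    return sum(a * x**k for k, a in enumerate(p))

p = [1, 1]        # p(x) = 1 + x
TD_p = T(D(p))    # x^2 p'(x)           = x^2
DT_p = D(T(p))    # x^2 p'(x) + 2x p(x) = 2x + 3x^2
```

Here `TD_p` is `[0, 0, 1]` and `DT_p` is `[0, 2, 3]`; their difference is $2x\,p(x) = 2x + 2x^2$, matching the extra term in the formula for $(DT)p$.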
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3.11 Linear maps take 0 to 0 


Suppose $T$ is a linear map from $V$ to $W$. Then $T(0) = 0$.


Proof By additivity, we have

\[ T(0) = T(0 + 0) = T(0) + T(0). \]

Add the additive inverse of $T(0)$ to each side of the equation above to
conclude that $T(0) = 0$. ∎


EXERCISES 3.A 


1 Suppose $b, c \in \mathbf{R}$. Define $T\colon \mathbf{R}^3 \to \mathbf{R}^2$ by
\[ T(x, y, z) = (2x - 4y + 3z + b,\; 6x + cxyz). \]
Show that $T$ is linear if and only if $b = c = 0$.


2 Suppose $b, c \in \mathbf{R}$. Define $T\colon \mathcal{P}(\mathbf{R}) \to \mathbf{R}^2$ by

\[ Tp = \Bigl(3p(4) + 5p'(6) + b\,p(1)\,p(2),\; \int_{-1}^{2} x^3 p(x)\,dx + c \sin p(0)\Bigr). \]

Show that $T$ is linear if and only if $b = c = 0$.


3 Suppose $T \in \mathcal{L}(\mathbf{F}^n, \mathbf{F}^m)$. Show that there exist
scalars $A_{j,k} \in \mathbf{F}$ for $j = 1, \dots, m$ and $k = 1, \dots, n$
such that

\[ T(x_1, \dots, x_n) = (A_{1,1} x_1 + \cdots + A_{1,n} x_n,\; \dots,\; A_{m,1} x_1 + \cdots + A_{m,n} x_n) \]

for every $(x_1, \dots, x_n) \in \mathbf{F}^n$.
[The exercise above shows that T has the form promised in the last item
of Example 3.4.]


4 Suppose $T \in \mathcal{L}(V, W)$ and $v_1, \dots, v_m$ is a list of vectors
in $V$ such that $Tv_1, \dots, Tv_m$ is a linearly independent list in $W$.
Prove that $v_1, \dots, v_m$ is linearly independent.


5 Prove the assertion in 3.7. 


6 Prove the assertions in 3.9. 


7 Show that every linear map from a 1-dimensional vector space to itself is
multiplication by some scalar. More precisely, prove that if $\dim V = 1$
and $T \in \mathcal{L}(V, V)$, then there exists $\lambda \in \mathbf{F}$ such
that $Tv = \lambda v$ for all $v \in V$.


8 Give an example of a function $\varphi\colon \mathbf{R}^2 \to \mathbf{R}$ such that

\[ \varphi(av) = a\varphi(v) \]

for all $a \in \mathbf{R}$ and all $v \in \mathbf{R}^2$ but $\varphi$ is not linear.
[The exercise above and the next exercise show that neither homogeneity
nor additivity alone is enough to imply that a function is a linear map.]


9 Give an example of a function $\varphi\colon \mathbf{C} \to \mathbf{C}$ such that

\[ \varphi(w + z) = \varphi(w) + \varphi(z) \]

for all $w, z \in \mathbf{C}$ but $\varphi$ is not linear. (Here $\mathbf{C}$
is thought of as a complex vector space.)
[There also exists a function $\varphi\colon \mathbf{R} \to \mathbf{R}$ such
that $\varphi$ satisfies the additivity condition above but $\varphi$ is not
linear. However, showing the existence of such a function involves
considerably more advanced tools.]


10 Suppose $U$ is a subspace of $V$ with $U \ne V$. Suppose
$S \in \mathcal{L}(U, W)$ and $S \ne 0$ (which means that $Su \ne 0$ for some
$u \in U$). Define $T\colon V \to W$ by

\[ Tv = \begin{cases} Sv & \text{if } v \in U, \\ 0 & \text{if } v \in V \text{ and } v \notin U. \end{cases} \]

Prove that $T$ is not a linear map on $V$.


11 Suppose $V$ is finite-dimensional. Prove that every linear map on a
subspace of $V$ can be extended to a linear map on $V$. In other words,
show that if $U$ is a subspace of $V$ and $S \in \mathcal{L}(U, W)$, then there
exists $T \in \mathcal{L}(V, W)$ such that $Tu = Su$ for all $u \in U$.


12 Suppose $V$ is finite-dimensional with $\dim V > 0$, and suppose $W$ is
infinite-dimensional. Prove that $\mathcal{L}(V, W)$ is infinite-dimensional.


13 Suppose $v_1, \dots, v_m$ is a linearly dependent list of vectors in $V$.
Suppose also that $W \ne \{0\}$. Prove that there exist
$w_1, \dots, w_m \in W$ such that no $T \in \mathcal{L}(V, W)$ satisfies
$Tv_k = w_k$ for each $k = 1, \dots, m$.


14 Suppose $V$ is finite-dimensional with $\dim V \ge 2$. Prove that there exist
$S, T \in \mathcal{L}(V, V)$ such that $ST \ne TS$.
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3.B Null Spaces and Ranges 


Null Space and Injectivity 


In this section we will learn about two subspaces that are intimately connected 
with each linear map. We begin with the set of vectors that get mapped to 0. 


3.12 Definition null space, null $T$


For $T \in \mathcal{L}(V, W)$, the null space of $T$, denoted null $T$, is the
subset of $V$ consisting of those vectors that $T$ maps to 0:

\[ \operatorname{null} T = \{ v \in V : Tv = 0 \}. \]


3.13 Example null space 


• If $T$ is the zero map from $V$ to $W$, in other words if $Tv = 0$ for every
$v \in V$, then $\operatorname{null} T = V$.


• Suppose $\varphi \in \mathcal{L}(\mathbf{C}^3, \mathbf{F})$ is defined by
$\varphi(z_1, z_2, z_3) = z_1 + 2z_2 + 3z_3$. Then
$\operatorname{null} \varphi = \{(z_1, z_2, z_3) \in \mathbf{C}^3 : z_1 + 2z_2 + 3z_3 = 0\}$.
A basis of $\operatorname{null} \varphi$ is $(-2, 1, 0), (-3, 0, 1)$.


• Suppose $D \in \mathcal{L}(\mathcal{P}(\mathbf{R}), \mathcal{P}(\mathbf{R}))$
is the differentiation map defined by $Dp = p'$. The only functions whose
derivative equals the zero function are the constant functions. Thus the
null space of $D$ equals the set of constant functions.


• Suppose $T \in \mathcal{L}(\mathcal{P}(\mathbf{R}), \mathcal{P}(\mathbf{R}))$
is the multiplication by $x^2$ map defined by $(Tp)(x) = x^2 p(x)$. The only
polynomial $p$ such that $x^2 p(x) = 0$ for all $x \in \mathbf{R}$ is the 0
polynomial. Thus $\operatorname{null} T = \{0\}$.


• Suppose $T \in \mathcal{L}(\mathbf{F}^\infty, \mathbf{F}^\infty)$ is the
backward shift defined by

\[ T(x_1, x_2, x_3, \dots) = (x_2, x_3, \dots). \]

Clearly $T(x_1, x_2, x_3, \dots)$ equals 0 if and only if $x_2, x_3, \dots$ are
all 0. Thus in this case we have
$\operatorname{null} T = \{(a, 0, 0, \dots) : a \in \mathbf{F}\}$.
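The second item above can be checked numerically. The sketch below uses Python's built-in complex numbers; the helper name `combine` and the sample coefficients are illustrative assumptions. It confirms that both listed basis vectors lie in the null space and shows how a vector with $z_1 + 2z_2 + 3z_3 = 0$ is recovered as $z_2 (-2, 1, 0) + z_3 (-3, 0, 1)$.

```python
def phi(z):
    """phi(z1, z2, z3) = z1 + 2*z2 + 3*z3, as in the example above."""
    z1, z2, z3 = z
    return z1 + 2 * z2 + 3 * z3

b1, b2 = (-2, 1, 0), (-3, 0, 1)

def combine(s, t):
    """Return the linear combination s*b1 + t*b2."""
    return tuple(s * x + t * y for x, y in zip(b1, b2))

# A sample element of the null space: choose z2 = i and z3 = 2 freely;
# the first coordinate is then forced to be -2*z2 - 3*z3.
z = combine(1j, 2)
```

The second and third coordinates of `z` are exactly the coefficients that were chosen, which is why the two vectors are linearly independent as well as spanning.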


The next result shows that the null space of each linear map is a subspace
of the domain. In particular, 0 is in the null space of every linear map.

Some mathematicians use the term kernel instead of null space. The word
“null” means zero. Thus the term “null space” should remind you of the
connection to 0.
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3.14 The null space is a subspace 
Suppose $T \in \mathcal{L}(V, W)$. Then $\operatorname{null} T$ is a subspace of $V$.


Proof Because $T$ is a linear map, we know that $T(0) = 0$ (by 3.11). Thus
$0 \in \operatorname{null} T$.

Suppose $u, v \in \operatorname{null} T$. Then

\[ T(u + v) = Tu + Tv = 0 + 0 = 0. \]

Hence $u + v \in \operatorname{null} T$. Thus $\operatorname{null} T$ is closed
under addition.

Suppose $u \in \operatorname{null} T$ and $\lambda \in \mathbf{F}$. Then

\[ T(\lambda u) = \lambda Tu = \lambda 0 = 0. \]

Hence $\lambda u \in \operatorname{null} T$. Thus $\operatorname{null} T$ is
closed under scalar multiplication.

We have shown that $\operatorname{null} T$ contains 0 and is closed under
addition and scalar multiplication. Thus $\operatorname{null} T$ is a subspace
of $V$ (by 1.34). ∎

Take another look at the null spaces that were computed in Example 3.13 and
note that all of them are subspaces.

As we will soon see, for a linear map the next definition is closely
connected to the null space.


3.15 Definition injective 


A function $T\colon V \to W$ is called injective if $Tu = Tv$ implies $u = v$.

Many mathematicians use the term one-to-one, which means the same as
injective.

The definition above could be rephrased to say that $T$ is injective if
$u \ne v$ implies that $Tu \ne Tv$. In other words, $T$ is injective if it maps
distinct inputs to distinct outputs.

The next result says that we can check whether a linear map is injective
by checking whether 0 is the only vector that gets mapped to 0. As a simple
application of this result, we see that of the linear maps whose null spaces we
computed in 3.13, only multiplication by $x^2$ is injective (except that the
zero map is injective in the special case $V = \{0\}$).
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3.16 Injectivity is equivalent to null space equals {0} 
Let $T \in \mathcal{L}(V, W)$. Then $T$ is injective if and only if
$\operatorname{null} T = \{0\}$.


Proof First suppose $T$ is injective. We want to prove that
$\operatorname{null} T = \{0\}$. We already know that
$\{0\} \subset \operatorname{null} T$ (by 3.11). To prove the inclusion in the
other direction, suppose $v \in \operatorname{null} T$. Then

\[ T(v) = 0 = T(0). \]

Because $T$ is injective, the equation above implies that $v = 0$. Thus we can
conclude that $\operatorname{null} T = \{0\}$, as desired.

To prove the implication in the other direction, now suppose
$\operatorname{null} T = \{0\}$. We want to prove that $T$ is injective. To do
this, suppose $u, v \in V$ and $Tu = Tv$. Then

\[ 0 = Tu - Tv = T(u - v). \]

Thus $u - v$ is in $\operatorname{null} T$, which equals $\{0\}$. Hence
$u - v = 0$, which implies that $u = v$. Hence $T$ is injective, as desired. ∎


Range and Surjectivity 


Now we give a name to the set of outputs of a function. 


3.17 Definition range 


For $T$ a function from $V$ to $W$, the range of $T$ is the subset of $W$
consisting of those vectors that are of the form $Tv$ for some $v \in V$:

\[ \operatorname{range} T = \{ Tv : v \in V \}. \]


3.18 Example range 


• If $T$ is the zero map from $V$ to $W$, in other words if $Tv = 0$ for every
$v \in V$, then $\operatorname{range} T = \{0\}$.


• If $T \in \mathcal{L}(\mathbf{R}^2, \mathbf{R}^3)$ is defined by
$T(x, y) = (2x, 5y, x + y)$, then
$\operatorname{range} T = \{(2x, 5y, x + y) : x, y \in \mathbf{R}\}$. A basis
of $\operatorname{range} T$ is $(2, 0, 1), (0, 5, 1)$.


• Suppose $D \in \mathcal{L}(\mathcal{P}(\mathbf{R}), \mathcal{P}(\mathbf{R}))$
is the differentiation map defined by $Dp = p'$. Because for every polynomial
$q \in \mathcal{P}(\mathbf{R})$ there exists a polynomial
$p \in \mathcal{P}(\mathbf{R})$ such that $p' = q$, the range of $D$ is
$\mathcal{P}(\mathbf{R})$.

The next result shows that the range of each linear map is a subspace of
the vector space into which it is being mapped.

Some mathematicians use the word image, which means the same as range.


3.19 The range is a subspace 
If $T \in \mathcal{L}(V, W)$, then $\operatorname{range} T$ is a subspace of $W$.


Proof Suppose $T \in \mathcal{L}(V, W)$. Then $T(0) = 0$ (by 3.11), which
implies that $0 \in \operatorname{range} T$.

If $w_1, w_2 \in \operatorname{range} T$, then there exist $v_1, v_2 \in V$
such that $Tv_1 = w_1$ and $Tv_2 = w_2$. Thus

\[ T(v_1 + v_2) = Tv_1 + Tv_2 = w_1 + w_2. \]

Hence $w_1 + w_2 \in \operatorname{range} T$. Thus $\operatorname{range} T$ is
closed under addition.

If $w \in \operatorname{range} T$ and $\lambda \in \mathbf{F}$, then there
exists $v \in V$ such that $Tv = w$. Thus

\[ T(\lambda v) = \lambda Tv = \lambda w. \]

Hence $\lambda w \in \operatorname{range} T$. Thus $\operatorname{range} T$ is
closed under scalar multiplication.

We have shown that $\operatorname{range} T$ contains 0 and is closed under
addition and scalar multiplication. Thus $\operatorname{range} T$ is a subspace
of $W$ (by 1.34). ∎


3.20 Definition surjective 


A function $T\colon V \to W$ is called surjective if its range equals $W$.


To illustrate the definition above, note that of the ranges we computed in
3.18, only the differentiation map is surjective (except that the zero map is
surjective in the special case $W = \{0\}$).

Many mathematicians use the term onto, which means the same as surjective.

Whether a linear map is surjective depends on what we are thinking of as
the vector space into which it maps.


3.21 Example The differentiation map
$D \in \mathcal{L}(\mathcal{P}_5(\mathbf{R}), \mathcal{P}_5(\mathbf{R}))$ defined
by $Dp = p'$ is not surjective, because the polynomial $x^5$ is not in the
range of $D$. However, the differentiation map
$S \in \mathcal{L}(\mathcal{P}_5(\mathbf{R}), \mathcal{P}_4(\mathbf{R}))$ defined
by $Sp = p'$ is surjective, because its range equals
$\mathcal{P}_4(\mathbf{R})$, which is now the vector space into which $S$ maps.
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Fundamental Theorem of Linear Maps 


The next result is so important that it gets a dramatic name. 


3.22 Fundamental Theorem of Linear Maps 


Suppose $V$ is finite-dimensional and $T \in \mathcal{L}(V, W)$. Then
$\operatorname{range} T$ is finite-dimensional and

\[ \dim V = \dim \operatorname{null} T + \dim \operatorname{range} T. \]


Proof Let $u_1, \dots, u_m$ be a basis of $\operatorname{null} T$; thus
$\dim \operatorname{null} T = m$. The linearly independent list
$u_1, \dots, u_m$ can be extended to a basis

\[ u_1, \dots, u_m, v_1, \dots, v_n \]

of $V$ (by 2.33). Thus $\dim V = m + n$. To complete the proof, we need only
show that $\operatorname{range} T$ is finite-dimensional and
$\dim \operatorname{range} T = n$. We will do this by proving that
$Tv_1, \dots, Tv_n$ is a basis of $\operatorname{range} T$.

Let $v \in V$. Because $u_1, \dots, u_m, v_1, \dots, v_n$ spans $V$, we can write

\[ v = a_1 u_1 + \cdots + a_m u_m + b_1 v_1 + \cdots + b_n v_n, \]

where the $a$'s and $b$'s are in $\mathbf{F}$. Applying $T$ to both sides of
this equation, we get

\[ Tv = b_1 Tv_1 + \cdots + b_n Tv_n, \]

where the terms of the form $Tu_j$ disappeared because each $u_j$ is in
$\operatorname{null} T$. The last equation implies that $Tv_1, \dots, Tv_n$
spans $\operatorname{range} T$. In particular, $\operatorname{range} T$ is
finite-dimensional.

To show $Tv_1, \dots, Tv_n$ is linearly independent, suppose
$c_1, \dots, c_n \in \mathbf{F}$ and

\[ c_1 Tv_1 + \cdots + c_n Tv_n = 0. \]

Then

\[ T(c_1 v_1 + \cdots + c_n v_n) = 0. \]

Hence

\[ c_1 v_1 + \cdots + c_n v_n \in \operatorname{null} T. \]

Because $u_1, \dots, u_m$ spans $\operatorname{null} T$, we can write

\[ c_1 v_1 + \cdots + c_n v_n = d_1 u_1 + \cdots + d_m u_m, \]

where the $d$'s are in $\mathbf{F}$. This equation implies that all the $c$'s
(and $d$'s) are 0 (because $u_1, \dots, u_m, v_1, \dots, v_n$ is linearly
independent). Thus $Tv_1, \dots, Tv_n$ is linearly independent and hence is a
basis of $\operatorname{range} T$, as desired. ∎
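For a map given by a concrete matrix, the dimension count in the theorem can be observed directly. The sketch below is an illustration, not the method of the proof: it uses Gaussian elimination over the rationals (a technique the text deliberately avoids) and relies on the standard fact that the number of pivot columns equals the dimension of the range. The function names and the sample 3-by-5 matrix are hypothetical.

```python
from fractions import Fraction

def rref(rows):
    """Reduced row echelon form over Q; returns (matrix, pivot columns)."""
    A = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(A[0])):
        if r == len(A):
            break
        # Look for a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(A)) if A[i][c] != 0), None)
        if p is None:
            continue
        A[r], A[p] = A[p], A[r]
        A[r] = [x / A[r][c] for x in A[r]]          # scale pivot to 1
        for i in range(len(A)):
            if i != r:                               # clear the rest of column c
                A[i] = [x - A[i][c] * y for x, y in zip(A[i], A[r])]
        pivots.append(c)
        r += 1
    return A, pivots

def null_space_basis(rows):
    """One basis vector of the null space for each non-pivot column."""
    R, pivots = rref(rows)
    n = len(rows[0])
    basis = []
    for f in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[f] = Fraction(1)
        for row, pc in zip(R, pivots):
            v[pc] = -row[f]
        basis.append(v)
    return basis

def apply(rows, v):
    """Apply the map x -> Ax to the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in rows]

# Hypothetical T from Q^5 to Q^3; the third row is the sum of the first two.
A = [[1, 2, 0, 1, 3],
     [2, 4, 1, 0, 1],
     [3, 6, 1, 1, 4]]

dim_V = len(A[0])
dim_range = len(rref(A)[1])
null_basis = null_space_basis(A)
dim_null = len(null_basis)
```

For this matrix the pivot columns are the first and third, so `dim_range` is 2 and `dim_null` is 3, which add up to `dim_V = 5`.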
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Now we can show that no linear map from a finite-dimensional vector 
space to a “smaller” vector space can be injective, where “smaller” is measured 
by dimension. 


3.23 A map to a smaller dimensional space is not injective 


Suppose $V$ and $W$ are finite-dimensional vector spaces such that
$\dim V > \dim W$. Then no linear map from $V$ to $W$ is injective.


Proof Let $T \in \mathcal{L}(V, W)$. Then

\begin{align*}
\dim \operatorname{null} T &= \dim V - \dim \operatorname{range} T \\
                           &\ge \dim V - \dim W \\
                           &> 0,
\end{align*}

where the equality above comes from the Fundamental Theorem of Linear
Maps (3.22). The inequality above states that $\dim \operatorname{null} T > 0$.
This means that $\operatorname{null} T$ contains vectors other than 0. Thus
$T$ is not injective (by 3.16). ∎


The next result shows that no linear map from a finite-dimensional vector 
space to a “bigger” vector space can be surjective, where “bigger” is measured 
by dimension. 


3.24 A map to a larger dimensional space is not surjective


Suppose $V$ and $W$ are finite-dimensional vector spaces such that
$\dim V < \dim W$. Then no linear map from $V$ to $W$ is surjective.


Proof Let $T \in \mathcal{L}(V, W)$. Then

\begin{align*}
\dim \operatorname{range} T &= \dim V - \dim \operatorname{null} T \\
                            &\le \dim V \\
                            &< \dim W,
\end{align*}

where the equality above comes from the Fundamental Theorem of Linear
Maps (3.22). The inequality above states that
$\dim \operatorname{range} T < \dim W$. This means that
$\operatorname{range} T$ cannot equal $W$. Thus $T$ is not surjective. ∎


As we will now see, 3.23 and 3.24 have important consequences in the 
theory of linear equations. The idea here is to express questions about systems 
of linear equations in terms of linear maps. 
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3.25 Example  Rephrase in terms of a linear map the question of whether 
a homogeneous system of linear equations has a nonzero solution. 


Solution Fix positive integers $m$ and $n$, and let $A_{j,k} \in \mathbf{F}$ for
$j = 1, \dots, m$ and $k = 1, \dots, n$. Consider the homogeneous system of
linear equations

\[
\begin{aligned}
\sum_{k=1}^{n} A_{1,k} x_k &= 0 \\
&\;\;\vdots \\
\sum_{k=1}^{n} A_{m,k} x_k &= 0.
\end{aligned}
\]

Homogeneous, in this context, means that the constant term on the right side
of each equation above is 0.

Obviously $x_1 = \cdots = x_n = 0$ is a solution of the system of equations
above; the question here is whether any other solutions exist.

Define $T\colon \mathbf{F}^n \to \mathbf{F}^m$ by

\[ T(x_1, \dots, x_n) = \Bigl( \sum_{k=1}^{n} A_{1,k} x_k,\; \dots,\; \sum_{k=1}^{n} A_{m,k} x_k \Bigr). \]

The equation $T(x_1, \dots, x_n) = 0$ (the 0 here is the additive identity in
$\mathbf{F}^m$, namely, the list of length $m$ of all 0's) is the same as the
homogeneous system of linear equations above.

Thus we want to know if $\operatorname{null} T$ is strictly bigger than
$\{0\}$. In other words, we can rephrase our question about nonzero solutions
as follows (by 3.16): What condition ensures that $T$ is not injective?


3.26 Homogeneous system of linear equations 


A homogeneous system of linear equations with more variables than 
equations has nonzero solutions. 


Proof Use the notation and result from the example above. Thus $T$ is a
linear map from $\mathbf{F}^n$ to $\mathbf{F}^m$, and we have a homogeneous
system of $m$ linear equations with $n$ variables $x_1, \dots, x_n$. From 3.23
we see that $T$ is not injective if $n > m$. ∎


Example of the result above: a homogeneous system of four linear equa- 
tions with five variables has nonzero solutions. 
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3.27 Example Consider the question of whether an inhomogeneous sys- 
tem of linear equations has no solutions for some choice of the constant terms. 
Rephrase this question in terms of a linear map. 


Solution Fix positive integers $m$ and $n$, and let $A_{j,k} \in \mathbf{F}$ for
$j = 1, \dots, m$ and $k = 1, \dots, n$. For $c_1, \dots, c_m \in \mathbf{F}$,
consider the system of linear equations

\[
\begin{aligned}
\sum_{k=1}^{n} A_{1,k} x_k &= c_1 \\
&\;\;\vdots \\
\sum_{k=1}^{n} A_{m,k} x_k &= c_m.
\end{aligned}
\tag{3.28}
\]

The question here is whether there is some choice of
$c_1, \dots, c_m \in \mathbf{F}$ such that no solution exists to the system
above.

Define $T\colon \mathbf{F}^n \to \mathbf{F}^m$ by

\[ T(x_1, \dots, x_n) = \Bigl( \sum_{k=1}^{n} A_{1,k} x_k,\; \dots,\; \sum_{k=1}^{n} A_{m,k} x_k \Bigr). \]

The equation $T(x_1, \dots, x_n) = (c_1, \dots, c_m)$ is the same as the system
of equations 3.28. Thus we want to know if
$\operatorname{range} T \ne \mathbf{F}^m$. Hence we can rephrase our question
about not having a solution for some choice of $c_1, \dots, c_m \in \mathbf{F}$
as follows: What condition ensures that $T$ is not surjective?


3.29 Inhomogeneous system of linear equations 


An inhomogeneous system of linear equations with more equations than
variables has no solution for some choice of the constant terms.


Proof Use the notation and result from the example above. Thus $T$ is a
linear map from $\mathbf{F}^n$ to $\mathbf{F}^m$, and we have a system of $m$
equations with $n$ variables $x_1, \dots, x_n$. From 3.24 we see that $T$ is
not surjective if $n < m$. ∎


Example of the result above: an inhomogeneous system of five linear
equations with four variables has no solution for some choice of the
constant terms.

Our results about homogeneous systems with more variables than equations
and inhomogeneous systems with more equations than variables (3.26 and 3.29)
are often proved using Gaussian elimination. The abstract approach taken here
leads to cleaner proofs.
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EXERCISES 3.B 


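1 Give an example of a linear map $T$ such that
$\dim \operatorname{null} T = 3$ and $\dim \operatorname{range} T = 2$.


2 Suppose $V$ is a vector space and $S, T \in \mathcal{L}(V, V)$ are such that

\[ \operatorname{range} S \subset \operatorname{null} T. \]

Prove that $(ST)^2 = 0$.


3 Suppose $v_1, \dots, v_m$ is a list of vectors in $V$. Define
$T \in \mathcal{L}(\mathbf{F}^m, V)$ by

\[ T(z_1, \dots, z_m) = z_1 v_1 + \cdots + z_m v_m. \]

(a) What property of $T$ corresponds to $v_1, \dots, v_m$ spanning $V$?
(b) What property of $T$ corresponds to $v_1, \dots, v_m$ being linearly
independent?


4 Show that

\[ \{ T \in \mathcal{L}(\mathbf{R}^5, \mathbf{R}^4) : \dim \operatorname{null} T > 2 \} \]

is not a subspace of $\mathcal{L}(\mathbf{R}^5, \mathbf{R}^4)$.


5 Give an example of a linear map $T\colon \mathbf{R}^4 \to \mathbf{R}^4$ such that

\[ \operatorname{range} T = \operatorname{null} T. \]


6 Prove that there does not exist a linear map
$T\colon \mathbf{R}^5 \to \mathbf{R}^5$ such that

\[ \operatorname{range} T = \operatorname{null} T. \]


7 Suppose $V$ and $W$ are finite-dimensional with $2 \le \dim V \le \dim W$.
Show that $\{ T \in \mathcal{L}(V, W) : T \text{ is not injective} \}$ is not a
subspace of $\mathcal{L}(V, W)$.


8 Suppose $V$ and $W$ are finite-dimensional with $\dim V \ge \dim W \ge 2$.
Show that $\{ T \in \mathcal{L}(V, W) : T \text{ is not surjective} \}$ is not a
subspace of $\mathcal{L}(V, W)$.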

9 Suppose $T \in \mathcal{L}(V, W)$ is injective and $v_1, \dots, v_n$ is
linearly independent in $V$. Prove that $Tv_1, \dots, Tv_n$ is linearly
independent in $W$.


10 Suppose $v_1, \dots, v_n$ spans $V$ and $T \in \mathcal{L}(V, W)$. Prove that
the list $Tv_1, \dots, Tv_n$ spans $\operatorname{range} T$.


11 Suppose $S_1, \dots, S_n$ are injective linear maps such that
$S_1 S_2 \cdots S_n$ makes sense. Prove that $S_1 S_2 \cdots S_n$ is injective.


12 Suppose that $V$ is finite-dimensional and that $T \in \mathcal{L}(V, W)$.
Prove that there exists a subspace $U$ of $V$ such that
$U \cap \operatorname{null} T = \{0\}$ and
$\operatorname{range} T = \{ Tu : u \in U \}$.


13 Suppose $T$ is a linear map from $\mathbf{F}^4$ to $\mathbf{F}^2$ such that

\[ \operatorname{null} T = \{ (x_1, x_2, x_3, x_4) \in \mathbf{F}^4 : x_1 = 5x_2 \text{ and } x_3 = 7x_4 \}. \]

Prove that $T$ is surjective.


14 Suppose $U$ is a 3-dimensional subspace of $\mathbf{R}^8$ and that $T$ is a
linear map from $\mathbf{R}^8$ to $\mathbf{R}^5$ such that
$\operatorname{null} T = U$. Prove that $T$ is surjective.


15 Prove that there does not exist a linear map from $\mathbf{F}^5$ to
$\mathbf{F}^2$ whose null space equals

\[ \{ (x_1, x_2, x_3, x_4, x_5) \in \mathbf{F}^5 : x_1 = 3x_2 \text{ and } x_3 = x_4 = x_5 \}. \]


16 Suppose there exists a linear map on $V$ whose null space and range are
both finite-dimensional. Prove that $V$ is finite-dimensional.


17 Suppose $V$ and $W$ are both finite-dimensional. Prove that there exists an
injective linear map from $V$ to $W$ if and only if $\dim V \le \dim W$.


18 Suppose $V$ and $W$ are both finite-dimensional. Prove that there exists a
surjective linear map from $V$ onto $W$ if and only if $\dim V \ge \dim W$.


19 Suppose $V$ and $W$ are finite-dimensional and that $U$ is a subspace of $V$.
Prove that there exists $T \in \mathcal{L}(V, W)$ such that
$\operatorname{null} T = U$ if and only if $\dim U \ge \dim V - \dim W$.


20 Suppose $W$ is finite-dimensional and $T \in \mathcal{L}(V, W)$. Prove that
$T$ is injective if and only if there exists $S \in \mathcal{L}(W, V)$ such
that $ST$ is the identity map on $V$.


21 Suppose $V$ is finite-dimensional and $T \in \mathcal{L}(V, W)$. Prove that
$T$ is surjective if and only if there exists $S \in \mathcal{L}(W, V)$ such
that $TS$ is the identity map on $W$.


22 Suppose $U$ and $V$ are finite-dimensional vector spaces and
$S \in \mathcal{L}(V, W)$ and $T \in \mathcal{L}(U, V)$. Prove that

\[ \dim \operatorname{null} ST \le \dim \operatorname{null} S + \dim \operatorname{null} T. \]


23 Suppose $U$ and $V$ are finite-dimensional vector spaces and
$S \in \mathcal{L}(V, W)$ and $T \in \mathcal{L}(U, V)$. Prove that

\[ \dim \operatorname{range} ST \le \min\{ \dim \operatorname{range} S, \dim \operatorname{range} T \}. \]


24 Suppose $W$ is finite-dimensional and $T_1, T_2 \in \mathcal{L}(V, W)$. Prove
that $\operatorname{null} T_1 \subset \operatorname{null} T_2$ if and only if
there exists $S \in \mathcal{L}(W, W)$ such that $T_2 = S T_1$.


25 Suppose $V$ is finite-dimensional and $T_1, T_2 \in \mathcal{L}(V, W)$. Prove
that $\operatorname{range} T_1 \subset \operatorname{range} T_2$ if and only if
there exists $S \in \mathcal{L}(V, V)$ such that $T_1 = T_2 S$.


26 Suppose $D \in \mathcal{L}(\mathcal{P}(\mathbf{R}), \mathcal{P}(\mathbf{R}))$
is such that $\deg Dp = (\deg p) - 1$ for every nonconstant polynomial
$p \in \mathcal{P}(\mathbf{R})$. Prove that $D$ is surjective.
[The notation D is used above to remind you of the differentiation map
that sends a polynomial p to p'. Without knowing the formula for the
derivative of a polynomial (except that it reduces the degree by 1), you
can use the exercise above to show that for every polynomial
$q \in \mathcal{P}(\mathbf{R})$, there exists a polynomial
$p \in \mathcal{P}(\mathbf{R})$ such that $p' = q$.]


27 Suppose $p \in \mathcal{P}(\mathbf{R})$. Prove that there exists a polynomial
$q \in \mathcal{P}(\mathbf{R})$ such that $5q'' + 3q' = p$.
[This exercise can be done without linear algebra, but it's more fun to do
it using linear algebra.]


28 Suppose $T \in \mathcal{L}(V, W)$, and $w_1, \dots, w_m$ is a basis of
$\operatorname{range} T$. Prove that there exist
$\varphi_1, \dots, \varphi_m \in \mathcal{L}(V, \mathbf{F})$ such that

\[ Tv = \varphi_1(v) w_1 + \cdots + \varphi_m(v) w_m \]

for every $v \in V$.


29 Suppose $\varphi \in \mathcal{L}(V, \mathbf{F})$. Suppose $u \in V$ is not in
$\operatorname{null} \varphi$. Prove that

\[ V = \operatorname{null} \varphi \oplus \{ au : a \in \mathbf{F} \}. \]


30 Suppose $\varphi_1$ and $\varphi_2$ are linear maps from $V$ to $\mathbf{F}$
that have the same null space. Show that there exists a constant
$c \in \mathbf{F}$ such that $\varphi_1 = c \varphi_2$.


31 Give an example of two linear maps $T_1$ and $T_2$ from $\mathbf{R}^5$ to
$\mathbf{R}^2$ that have the same null space but are such that $T_1$ is not a
scalar multiple of $T_2$.
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3.C Matrices 


Representing a Linear Map by a Matrix 


We know that if $v_1, \dots, v_n$ is a basis of $V$ and $T\colon V \to W$ is
linear, then the values of $Tv_1, \dots, Tv_n$ determine the values of $T$ on
arbitrary vectors in $V$ (see 3.5). As we will soon see, matrices are used as
an efficient method of recording the values of the $Tv_j$'s in terms of a basis
of $W$.


3.30 Definition matrix, $A_{j,k}$


Let $m$ and $n$ denote positive integers. An $m$-by-$n$ matrix $A$ is a
rectangular array of elements of $\mathbf{F}$ with $m$ rows and $n$ columns:

\[
A = \begin{pmatrix}
A_{1,1} & \cdots & A_{1,n} \\
\vdots  &        & \vdots  \\
A_{m,1} & \cdots & A_{m,n}
\end{pmatrix}.
\]

The notation $A_{j,k}$ denotes the entry in row $j$, column $k$ of $A$. In
other words, the first index refers to the row number and the second index
refers to the column number.


Thus $A_{2,3}$ refers to the entry in the second row, third column of a
matrix $A$.


3.31 Example If
\[ A = \begin{pmatrix} 8 & 4 & 5 - 3i \\ 1 & 9 & 7 \end{pmatrix}, \]
then $A_{2,3} = 7$.


Now we come to the key definition in this section. 


3.32 Definition matrix of a linear map, $\mathcal{M}(T)$


Suppose $T \in \mathcal{L}(V, W)$ and $v_1, \dots, v_n$ is a basis of $V$ and
$w_1, \dots, w_m$ is a basis of $W$. The matrix of $T$ with respect to these
bases is the $m$-by-$n$ matrix $\mathcal{M}(T)$ whose entries $A_{j,k}$ are
defined by

\[ Tv_k = A_{1,k} w_1 + \cdots + A_{m,k} w_m. \]

If the bases are not clear from the context, then the notation
$\mathcal{M}\bigl(T, (v_1, \dots, v_n), (w_1, \dots, w_m)\bigr)$ is used.


The matrix $\mathcal{M}(T)$ of a linear map $T \in \mathcal{L}(V, W)$ depends
on the basis $v_1, \dots, v_n$ of $V$ and the basis $w_1, \dots, w_m$ of $W$,
as well as on $T$. However, the bases should be clear from the context, and
thus they are often not included in the notation.
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To remember how M (T) is constructed from T, you might write across 


the top of the matrix the basis vectors $v_1, \dots, v_n$ for the domain and
along the left the basis vectors $w_1, \dots, w_m$ for the vector space into
which $T$ maps, as follows:

\[
\mathcal{M}(T) =
\begin{array}{c|ccccc}
       & v_1 & \cdots & v_k     & \cdots & v_n \\ \hline
w_1    &     &        & A_{1,k} &        &     \\
\vdots &     &        & \vdots  &        &     \\
w_m    &     &        & A_{m,k} &        &
\end{array}
\]

In the matrix above only the $k$th column is shown. Thus the second index
of each displayed entry of the matrix above is $k$. The picture above should
remind you that $Tv_k$ can be computed from $\mathcal{M}(T)$ by multiplying
each entry in the $k$th column by the corresponding $w_j$ from the left column,
and then adding up the resulting vectors.

The $k$th column of $\mathcal{M}(T)$ consists of the scalars needed to write
$Tv_k$ as a linear combination of $(w_1, \dots, w_m)$:

\[ Tv_k = \sum_{j=1}^{m} A_{j,k} w_j. \]

If $T$ is a linear map from $\mathbf{F}^n$ to $\mathbf{F}^m$, then unless
stated otherwise, assume the bases in question are the standard ones (where
the $k$th basis vector is 1 in the $k$th slot and 0 in all the other slots). If
you think of elements of $\mathbf{F}^m$ as columns of $m$ numbers, then you can
think of the $k$th column of $\mathcal{M}(T)$ as $T$ applied to the $k$th
standard basis vector.

If $T$ maps an $n$-dimensional vector space to an $m$-dimensional vector
space, then $\mathcal{M}(T)$ is an $m$-by-$n$ matrix.


3.33 Example Suppose $T \in \mathcal{L}(\mathbf{F}^2, \mathbf{F}^3)$ is defined by

\[ T(x, y) = (x + 3y,\; 2x + 5y,\; 7x + 9y). \]

Find the matrix of $T$ with respect to the standard bases of $\mathbf{F}^2$ and
$\mathbf{F}^3$.


Solution Because $T(1, 0) = (1, 2, 7)$ and $T(0, 1) = (3, 5, 9)$, the matrix of
$T$ with respect to the standard bases is the 3-by-2 matrix below:

\[ \mathcal{M}(T) = \begin{pmatrix} 1 & 3 \\ 2 & 5 \\ 7 & 9 \end{pmatrix}. \]

When working with $\mathcal{P}_m(\mathbf{F})$, use the standard basis
$1, x, x^2, \dots, x^m$ unless the context indicates otherwise.


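3.34 Example Suppose
$D \in \mathcal{L}(\mathcal{P}_3(\mathbf{R}), \mathcal{P}_2(\mathbf{R}))$ is the
differentiation map defined by $Dp = p'$. Find the matrix of $D$ with respect
to the standard bases of $\mathcal{P}_3(\mathbf{R})$ and
$\mathcal{P}_2(\mathbf{R})$.


Solution Because $(x^n)' = n x^{n-1}$, the matrix of $D$ with respect to the
standard bases is the 3-by-4 matrix below:

\[
\mathcal{M}(D) = \begin{pmatrix}
0 & 1 & 0 & 0 \\
0 & 0 & 2 & 0 \\
0 & 0 & 0 & 3
\end{pmatrix}.
\]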

Addition and Scalar Multiplication of Matrices 


For the rest of this section, assume that V and W are finite-dimensional and 
that a basis has been chosen for each of these vector spaces. Thus for each 
linear map from V to W, we can talk about its matrix (with respect to the 
chosen bases, of course). Is the matrix of the sum of two linear maps equal to 
the sum of the matrices of the two maps? 

Right now this question does not make sense, because although we have 
defined the sum of two linear maps, we have not defined the sum of two 
matrices. Fortunately, the obvious definition of the sum of two matrices has 
the right properties. Specifically, we make the following definition. 


3.35 Definition matrix addition 


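The sum of two matrices of the same size is the matrix obtained by adding
corresponding entries in the matrices:

\[
\begin{pmatrix}
A_{1,1} & \cdots & A_{1,n} \\
\vdots  &        & \vdots  \\
A_{m,1} & \cdots & A_{m,n}
\end{pmatrix}
+
\begin{pmatrix}
C_{1,1} & \cdots & C_{1,n} \\
\vdots  &        & \vdots  \\
C_{m,1} & \cdots & C_{m,n}
\end{pmatrix}
=
\begin{pmatrix}
A_{1,1} + C_{1,1} & \cdots & A_{1,n} + C_{1,n} \\
\vdots            &        & \vdots            \\
A_{m,1} + C_{m,1} & \cdots & A_{m,n} + C_{m,n}
\end{pmatrix}.
\]

In other words, $(A + C)_{j,k} = A_{j,k} + C_{j,k}$.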
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In the following result, the assumption is that the same bases are used for 
all three linear maps S + T, S, and T. 


3.36 The matrix of the sum of linear maps
Suppose $S, T \in \mathcal{L}(V, W)$. Then
$\mathcal{M}(S + T) = \mathcal{M}(S) + \mathcal{M}(T)$.


The verification of the result above is left to the reader. 

Still assuming that we have some bases in mind, is the matrix of a scalar 
times a linear map equal to the scalar times the matrix of the linear map? 
Again the question does not make sense, because we have not defined scalar 
multiplication on matrices. Fortunately, the obvious definition again has the 
right properties. 


3.37 Definition scalar multiplication of a matrix 


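The product of a scalar and a matrix is the matrix obtained by multiplying
each entry in the matrix by the scalar:

\[
\lambda
\begin{pmatrix}
A_{1,1} & \cdots & A_{1,n} \\
\vdots  &        & \vdots  \\
A_{m,1} & \cdots & A_{m,n}
\end{pmatrix}
=
\begin{pmatrix}
\lambda A_{1,1} & \cdots & \lambda A_{1,n} \\
\vdots          &        & \vdots          \\
\lambda A_{m,1} & \cdots & \lambda A_{m,n}
\end{pmatrix}.
\]

In other words, $(\lambda A)_{j,k} = \lambda A_{j,k}$.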


In the following result, the assumption is that the same bases are used for 
both linear maps AT and T. 


3.38 The matrix of a scalar times a linear map
Suppose $\lambda \in \mathbf{F}$ and $T \in \mathcal{L}(V, W)$. Then
$\mathcal{M}(\lambda T) = \lambda \mathcal{M}(T)$.


The verification of the result above is also left to the reader. 

Because addition and scalar multiplication have now been defined for 
matrices, you should not be surprised that a vector space is about to appear. 
We need only a bit of notation so that this new vector space has a name. 


3.39 Notation $\mathbf{F}^{m,n}$


For $m$ and $n$ positive integers, the set of all $m$-by-$n$ matrices with
entries in $\mathbf{F}$ is denoted by $\mathbf{F}^{m,n}$.

3.40 $\dim \mathbf{F}^{m,n} = mn$


Suppose $m$ and $n$ are positive integers. With addition and scalar
multiplication defined as above, $\mathbf{F}^{m,n}$ is a vector space with
dimension $mn$.


Proof The verification that $\mathbf{F}^{m,n}$ is a vector space is left to the
reader. Note that the additive identity of $\mathbf{F}^{m,n}$ is the
$m$-by-$n$ matrix whose entries all equal 0.

The reader should also verify that the list of $m$-by-$n$ matrices that have 0
in all entries except for a 1 in one entry is a basis of $\mathbf{F}^{m,n}$.
There are $mn$ such matrices, so the dimension of $\mathbf{F}^{m,n}$ equals
$mn$. ∎


Matrix Multiplication 


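Suppose, as previously, that $v_1, \dots, v_n$ is a basis of $V$ and
$w_1, \dots, w_m$ is a basis of $W$. Suppose also that we have another vector
space $U$ and that $u_1, \dots, u_p$ is a basis of $U$.


Consider linear maps $T\colon U \to V$ and $S\colon V \to W$. The composition
$ST$ is a linear map from $U$ to $W$. Does $\mathcal{M}(ST)$ equal
$\mathcal{M}(S)\mathcal{M}(T)$? This question does not yet make sense, because
we have not defined the product of two matrices. We will choose a definition
of matrix multiplication that forces this question to have a positive answer.
Let's see how to do this.

Suppose $\mathcal{M}(S) = A$ and $\mathcal{M}(T) = C$. For $1 \le k \le p$, we
have

\begin{align*}
(ST)u_k &= S\Bigl( \sum_{r=1}^{n} C_{r,k} v_r \Bigr) \\
        &= \sum_{r=1}^{n} C_{r,k}\, S v_r \\
        &= \sum_{r=1}^{n} C_{r,k} \sum_{j=1}^{m} A_{j,r} w_j \\
        &= \sum_{j=1}^{m} \Bigl( \sum_{r=1}^{n} A_{j,r} C_{r,k} \Bigr) w_j.
\end{align*}

Thus $\mathcal{M}(ST)$ is the $m$-by-$p$ matrix whose entry in row $j$,
column $k$, equals

\[ \sum_{r=1}^{n} A_{j,r} C_{r,k}. \]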
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Now we see how to define matrix multiplication so that the desired equation 


M(ST) = M(S)M(T) holds. 


3.41 Definition matrix multiplication 


Suppose $A$ is an $m$-by-$n$ matrix and $C$ is an $n$-by-$p$ matrix. Then $AC$
is defined to be the $m$-by-$p$ matrix whose entry in row $j$, column $k$, is
given by the following equation:

\[ (AC)_{j,k} = \sum_{r=1}^{n} A_{j,r} C_{r,k}. \]

In other words, the entry in row $j$, column $k$, of $AC$ is computed by
taking row $j$ of $A$ and column $k$ of $C$, multiplying together corresponding
entries, and then summing.


Note that we define the product of two matrices only when the number of
columns of the first matrix equals the number of rows of the second matrix.

You may have learned this definition of matrix multiplication in an earlier
course, although you may not have seen the motivation for it.


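3.42 Example Here we multiply together a 3-by-2 matrix and a 2-by-4
matrix, obtaining a 3-by-4 matrix:

\[
\begin{pmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{pmatrix}
\begin{pmatrix} 6 & 5 & 4 & 3 \\ 2 & 1 & 0 & -1 \end{pmatrix}
=
\begin{pmatrix} 10 & 7 & 4 & 1 \\ 26 & 19 & 12 & 5 \\ 42 & 31 & 20 & 9 \end{pmatrix}
\]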
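
The definition in 3.41 translates directly into a few lines of code. The sketch below is a minimal illustration, not an efficient implementation; the function name `mat_mul` and the representation of a matrix as a list of rows are our own choices. It takes the factors of Example 3.42 to be the 3-by-2 matrix with rows (1, 2), (3, 4), (5, 6) and the 2-by-4 matrix with rows (6, 5, 4, 3), (2, 1, 0, −1).

```python
def mat_mul(A, C):
    """Product of an m-by-n matrix A and an n-by-p matrix C (lists of rows)."""
    n = len(C)
    if any(len(row) != n for row in A):
        raise ValueError("number of columns of A must equal number of rows of C")
    p = len(C[0])
    # Entry (j, k) is the sum over r of A[j][r] * C[r][k], as in 3.41.
    return [[sum(A[j][r] * C[r][k] for r in range(n)) for k in range(p)]
            for j in range(len(A))]

A = [[1, 2], [3, 4], [5, 6]]
C = [[6, 5, 4, 3], [2, 1, 0, -1]]
product = mat_mul(A, C)
```

With these inputs `product` is the 3-by-4 matrix with rows (10, 7, 4, 1), (26, 19, 12, 5), (42, 31, 20, 9). Reversing the factors, `mat_mul(C, A)`, raises an error because a 2-by-4 matrix cannot be multiplied by a 3-by-2 matrix.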


Matrix multiplication is not commutative. In other words, AC is not 
necessarily equal to CA even if both products are defined (see Exercise 12). 
Matrix multiplication is distributive and associative (see Exercises 13 and 14). 

In the following result, the assumption is that the same basis of V is used
in considering $T \in L(U, V)$ and $S \in L(V, W)$, the same basis of W is used
in considering $S \in L(V, W)$ and $ST \in L(U, W)$, and the same basis of U is
used in considering $T \in L(U, V)$ and $ST \in L(U, W)$.


3.43 The matrix of the product of linear maps
If $T \in L(U, V)$ and $S \in L(V, W)$, then M(ST) = M(S)M(T).


The proof of the result above is the calculation that was done as motivation 
before the definition of matrix multiplication. 
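
A small numerical check of 3.43 may help. The sketch below makes several assumptions for illustration only: U = R^2, V = R^3, W = R^2 with their standard bases, two sample linear maps T and S written as Python functions, and helper names (`matrix_of`, `mat_mul`) of our own. The matrix of a map is found column by column by applying the map to the standard basis vectors.

```python
def mat_mul(A, C):
    """Matrix product as in 3.41; matrices are lists of rows."""
    return [[sum(A[j][r] * C[r][k] for r in range(len(C)))
             for k in range(len(C[0]))] for j in range(len(A))]

def matrix_of(f, dim_domain):
    """Matrix of the linear map f with respect to standard bases.

    Column k holds the coordinates of f applied to the k-th standard basis vector.
    """
    columns = []
    for k in range(dim_domain):
        e_k = tuple(1 if i == k else 0 for i in range(dim_domain))
        columns.append(f(e_k))
    # Transpose the list of columns into a list of rows.
    return [list(row) for row in zip(*columns)]

def T(u):  # a sample linear map from R^2 to R^3
    x, y = u
    return (x + y, 2 * x, 3 * y)

def S(v):  # a sample linear map from R^3 to R^2
    a, b, c = v
    return (a - c, b + 4 * c)

def ST(u):  # the composition, a linear map from R^2 to R^2
    return S(T(u))

M_T = matrix_of(T, 2)
M_S = matrix_of(S, 3)
M_ST = matrix_of(ST, 2)
```

For these sample maps, `M_ST` and `mat_mul(M_S, M_T)` are both the matrix with rows (1, −2) and (2, 12).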

In the next piece of notation, note that as usual the first index refers to a
row and the second index refers to a column, with a vertically centered dot
used as a placeholder.


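3.44 Notation $A_{j,\cdot}$, $A_{\cdot,k}$


Suppose A is an m-by-n matrix.


• If $1 \le j \le m$, then $A_{j,\cdot}$ denotes the 1-by-n matrix consisting of
row j of A.


• If $1 \le k \le n$, then $A_{\cdot,k}$ denotes the m-by-1 matrix consisting of
column k of A.


3.45 Example If $A = \begin{pmatrix} 8 & 4 & 5 \\ 1 & 9 & 7 \end{pmatrix}$, then $A_{2,\cdot}$ is row 2 of A and $A_{\cdot,2}$ is
column 2 of A. In other words,

\[
A_{2,\cdot} = \begin{pmatrix} 1 & 9 & 7 \end{pmatrix} \quad\text{and}\quad A_{\cdot,2} = \begin{pmatrix} 4 \\ 9 \end{pmatrix}.
\]


The product of a 1-by-n matrix and an n-by-1 matrix is a 1-by-1 matrix.
However, we will frequently identify a 1-by-1 matrix with its entry.


3.46 Example $\begin{pmatrix} 3 & 4 \end{pmatrix} \begin{pmatrix} 6 \\ 2 \end{pmatrix} = \begin{pmatrix} 26 \end{pmatrix}$ because $3 \cdot 6 + 4 \cdot 2 = 26$.

However, we can identify $\begin{pmatrix} 26 \end{pmatrix}$ with 26, writing $\begin{pmatrix} 3 & 4 \end{pmatrix} \begin{pmatrix} 6 \\ 2 \end{pmatrix} = 26$.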


Our next result gives another way to think of matrix multiplication: the 
entry in row j, column k, of AC equals (row j of A) times (column k of C). 


3.47 Entry of matrix product equals row times column 


Suppose A is an m-by-n matrix and C is an n-by-p matrix. Then

\[
(AC)_{j,k} = A_{j,\cdot} C_{\cdot,k}
\]

for $1 \le j \le m$ and $1 \le k \le p$.

The proof of the result above follows immediately from the definitions.


3.48 Example The result above and Example 3.46 show why the entry
in row 2, column 1, of the product in Example 3.42 equals 26.

The next result gives yet another way to think of matrix multiplication. It
states that column k of AC equals A times column k of C.


3.49 Column of matrix product equals matrix times column 

Suppose A is an m-by-n matrix and C is an n-by-p matrix. Then

\[
(AC)_{\cdot,k} = A C_{\cdot,k}
\]

for $1 \le k \le p$.


Again, the proof of the result above follows immediately from the defini- 
tions and is left to the reader. 


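3.50 Example From the result above and the equation

\[
\begin{pmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{pmatrix} \begin{pmatrix} 5 \\ 1 \end{pmatrix} = \begin{pmatrix} 7 \\ 19 \\ 31 \end{pmatrix},
\]

we see why column 2 in the matrix product in Example 3.42 is the right side
of the equation above.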


We give one more way of thinking about the product of an m-by-n matrix 
and an n-by-1 matrix. The following example illustrates this approach. 


3.51 Example In the example above, the product of a 3-by-2 matrix and
a 2-by-1 matrix is a linear combination of the columns of the 3-by-2 matrix,
with the scalars that multiply the columns coming from the 2-by-1 matrix.
Specifically,

\[
\begin{pmatrix} 7 \\ 19 \\ 31 \end{pmatrix} = 5 \begin{pmatrix} 1 \\ 3 \\ 5 \end{pmatrix} + 1 \begin{pmatrix} 2 \\ 4 \\ 6 \end{pmatrix}.
\]


The next result generalizes the example above. Again, the proof follows 
easily from the definitions and is left to the reader. 


3.52 Linear combination of columns

Suppose A is an m-by-n matrix and $c = \begin{pmatrix} c_1 \\ \vdots \\ c_n \end{pmatrix}$ is an n-by-1 matrix.
Then

\[
Ac = c_1 A_{\cdot,1} + \cdots + c_n A_{\cdot,n}.
\]

In other words, Ac is a linear combination of the columns of A, with the
scalars that multiply the columns coming from c.
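
The column viewpoints in 3.49 and 3.52 can be checked on the matrices of Example 3.42. The sketch below is illustrative only; the helper names (`mat_mul`, `column`, `scale`, `add`) are ours, columns are indexed from 0 as is usual in Python, and the second factor is taken to have rows (6, 5, 4, 3) and (2, 1, 0, −1).

```python
def mat_mul(A, C):
    """Matrix product following 3.41; matrices are lists of rows."""
    return [[sum(A[j][r] * C[r][k] for r in range(len(C)))
             for k in range(len(C[0]))] for j in range(len(A))]

def column(A, k):
    """Column k of A (0-based) as an m-by-1 matrix."""
    return [[row[k]] for row in A]

def scale(s, col):
    """Scalar multiple of an m-by-1 matrix."""
    return [[s * entry[0]] for entry in col]

def add(col1, col2):
    """Sum of two m-by-1 matrices."""
    return [[a[0] + b[0]] for a, b in zip(col1, col2)]

A = [[1, 2], [3, 4], [5, 6]]
C = [[6, 5, 4, 3], [2, 1, 0, -1]]
AC = mat_mul(A, C)

# 3.49: column k of AC equals A times column k of C.
columns_agree = all(column(AC, k) == mat_mul(A, column(C, k)) for k in range(4))

# 3.52: A times the column (5, 1) equals 5 * (first column of A) + 1 * (second column of A).
c = [[5], [1]]
combo = add(scale(5, column(A, 0)), scale(1, column(A, 1)))
```

Both `mat_mul(A, c)` and `combo` are the 3-by-1 matrix with entries 7, 19, 31, as in Examples 3.50 and 3.51.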




Two more ways to think about matrix multiplication are given by Exercises 
10 and 11. 


EXERCISES 3.C 


1 Suppose V and W are finite-dimensional and $T \in L(V, W)$. Show that
with respect to each choice of bases of V and W, the matrix of T has at
least dim range T nonzero entries.


2 Suppose $D \in L(P_3(R), P_2(R))$ is the differentiation map defined by
$Dp = p'$. Find a basis of $P_3(R)$ and a basis of $P_2(R)$ such that the
matrix of D with respect to these bases is

\[
\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}.
\]

[Compare the exercise above to Example 3.34.
The next exercise generalizes the exercise above.]


3 Suppose V and W are finite-dimensional and $T \in L(V, W)$. Prove
that there exist a basis of V and a basis of W such that with respect to
these bases, all entries of M(T) are 0 except that the entries in row j,
column j, equal 1 for $1 \le j \le \dim \operatorname{range} T$.


4 Suppose $v_1, \dots, v_m$ is a basis of V and W is finite-dimensional. Suppose
$T \in L(V, W)$. Prove that there exists a basis $w_1, \dots, w_n$ of W such that
all the entries in the first column of M(T) (with respect to the bases
$v_1, \dots, v_m$ and $w_1, \dots, w_n$) are 0 except for possibly a 1 in the first row,
first column.

[In this exercise, unlike Exercise 3, you are given the basis of V instead
of being able to choose a basis of V.]


5 Suppose $w_1, \dots, w_n$ is a basis of W and V is finite-dimensional. Suppose
$T \in L(V, W)$. Prove that there exists a basis $v_1, \dots, v_m$ of V such
that all the entries in the first row of M(T) (with respect to the bases
$v_1, \dots, v_m$ and $w_1, \dots, w_n$) are 0 except for possibly a 1 in the first row,
first column.

[In this exercise, unlike Exercise 3, you are given the basis of W instead
of being able to choose a basis of W.]


6 Suppose V and W are finite-dimensional and $T \in L(V, W)$. Prove that
dim range T = 1 if and only if there exist a basis of V and a basis of W
such that with respect to these bases, all entries of M(T) equal 1.


7 Verify 3.36.


8 Verify 3.38.


9 Prove 3.52.


10 Suppose A is an m-by-n matrix and C is an n-by-p matrix. Prove that

\[
(AC)_{j,\cdot} = A_{j,\cdot} C
\]

for $1 \le j \le m$. In other words, show that row j of AC equals
(row j of A) times C.


11 Suppose $a = \begin{pmatrix} a_1 & \cdots & a_n \end{pmatrix}$ is a 1-by-n matrix and C is an n-by-p
matrix. Prove that

\[
aC = a_1 C_{1,\cdot} + \cdots + a_n C_{n,\cdot}.
\]

In other words, show that aC is a linear combination of the rows of C,
with the scalars that multiply the rows coming from a.


12 Give an example with 2-by-2 matrices to show that matrix multiplication
is not commutative. In other words, find 2-by-2 matrices A and C such
that $AC \ne CA$.


13 Prove that the distributive property holds for matrix addition and matrix
multiplication. In other words, suppose A, B, C, D, E, and F are
matrices whose sizes are such that A(B + C) and (D + E)F make
sense. Prove that AB + AC and DF + EF both make sense and that
A(B + C) = AB + AC and (D + E)F = DF + EF.


14 Prove that matrix multiplication is associative. In other words, suppose
A, B, and C are matrices whose sizes are such that (AB)C makes sense.
Prove that A(BC) makes sense and that (AB)C = A(BC).


15 Suppose A is an n-by-n matrix and $1 \le j, k \le n$. Show that the entry in
row j, column k, of $A^3$ (which is defined to mean AAA) is

\[
\sum_{p=1}^{n} \sum_{r=1}^{n} A_{j,p} A_{p,r} A_{r,k}.
\]


3.D Invertibility and Isomorphic Vector Spaces


Invertible Linear Maps 


We begin this section by defining the notions of invertible and inverse in the 
context of linear maps. 


3.53 Definition invertible, inverse 


• A linear map $T \in L(V, W)$ is called invertible if there exists a
linear map $S \in L(W, V)$ such that ST equals the identity map on
V and TS equals the identity map on W.


• A linear map $S \in L(W, V)$ satisfying ST = I and TS = I is
called an inverse of T (note that the first I is the identity map on V
and the second I is the identity map on W).
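
To see the two conditions ST = I and TS = I at work, consider the following sketch. The map T on R^2 and its candidate inverse S are sample choices of ours, made only for illustration. Because the compositions ST and TS are linear, agreement with the identity on a basis already forces agreement everywhere; the grid of sample points below includes the standard basis vectors (1, 0) and (0, 1).

```python
def T(v):
    """A sample linear map from R^2 to R^2 (chosen for illustration)."""
    x, y = v
    return (x + 2 * y, 3 * x + 5 * y)

def S(w):
    """Candidate inverse of T, found by solving T(x, y) = (a, b) for (x, y)."""
    a, b = w
    return (-5 * a + 2 * b, 3 * a - b)

samples = [(x, y) for x in range(-3, 4) for y in range(-3, 4)]
# ST should be the identity on the domain of T, and TS the identity on its target.
st_is_identity = all(S(T(v)) == v for v in samples)
ts_is_identity = all(T(S(w)) == w for w in samples)
```

Both flags come out true, so S is an inverse of T in the sense of the definition above.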


3.54 Inverse is unique 


An invertible linear map has a unique inverse. 


Proof Suppose $T \in L(V, W)$ is invertible and $S_1$ and $S_2$ are inverses of T.
Then

\[
S_1 = S_1 I = S_1 (T S_2) = (S_1 T) S_2 = I S_2 = S_2.
\]

Thus $S_1 = S_2$. ∎

Now that we know that the inverse is unique, we can give it a notation.


3.55 Notation $T^{-1}$


If T is invertible, then its inverse is denoted by $T^{-1}$. In other words, if
$T \in L(V, W)$ is invertible, then $T^{-1}$ is the unique element of L(W, V)
such that $T^{-1} T = I$ and $T T^{-1} = I$.


The following result characterizes the invertible linear maps. 


3.56 Invertibility is equivalent to injectivity and surjectivity 


A linear map is invertible if and only if it is injective and surjective.

Proof Suppose $T \in L(V, W)$. We need to show that T is invertible if and
only if it is injective and surjective.

First suppose T is invertible. To show that T is injective, suppose $u, v \in V$
and Tu = Tv. Then

\[
u = T^{-1}(Tu) = T^{-1}(Tv) = v,
\]

so u = v. Hence T is injective.

We are still assuming that T is invertible. Now we want to prove that T is
surjective. To do this, let $w \in W$. Then $w = T(T^{-1} w)$, which shows that w is
in the range of T. Thus range T = W. Hence T is surjective, completing this
direction of the proof.

Now suppose T is injective and surjective. We want to prove that T is
invertible. For each $w \in W$, define Sw to be the unique element of V such
that T(Sw) = w (the existence and uniqueness of such an element follow
from the surjectivity and injectivity of T). Clearly $T \circ S$ equals the identity
map on W.

To prove that $S \circ T$ equals the identity map on V, let $v \in V$. Then

\[
T\bigl((S \circ T)v\bigr) = (T \circ S)(Tv) = I(Tv) = Tv.
\]

This equation implies that $(S \circ T)v = v$ (because T is injective). Thus $S \circ T$
equals the identity map on V.

To complete the proof, we need to show that S is linear. To do this, suppose
$w_1, w_2 \in W$. Then

\[
T(Sw_1 + Sw_2) = T(Sw_1) + T(Sw_2) = w_1 + w_2.
\]

Thus $Sw_1 + Sw_2$ is the unique element of V that T maps to $w_1 + w_2$. By
the definition of S, this implies that $S(w_1 + w_2) = Sw_1 + Sw_2$. Hence S
satisfies the additive property required for linearity.

The proof of homogeneity is similar. Specifically, if $w \in W$ and $\lambda \in F$,
then

\[
T(\lambda Sw) = \lambda T(Sw) = \lambda w.
\]

Thus $\lambda Sw$ is the unique element of V that T maps to $\lambda w$. By the definition of
S, this implies that $S(\lambda w) = \lambda Sw$. Hence S is linear, as desired. ∎


3.57 Example linear maps that are not invertible 


• The multiplication by $x^2$ linear map from P(R) to P(R) (see 3.4) is
not invertible because it is not surjective (1 is not in the range).


• The backward shift linear map from $F^\infty$ to $F^\infty$ (see 3.4) is not invertible
because it is not injective [(1, 0, 0, 0, ...) is in the null space].


Isomorphic Vector Spaces


The next definition captures the idea of two vector spaces that are essentially 
the same, except for the names of the elements of the vector spaces. 


3.58 Definition isomorphism, isomorphic 


• An isomorphism is an invertible linear map.


• Two vector spaces are called isomorphic if there is an isomorphism
from one vector space onto the other one.


Think of an isomorphism $T: V \to W$ as relabeling $v \in V$ as $Tv \in W$. This
viewpoint explains why two isomorphic vector spaces have the same vector
space properties. The terms “isomorphism” and “invertible linear map” mean
the same thing. Use “isomorphism” when you want to emphasize that the
two spaces are essentially the same.

The Greek word isos means equal; the Greek word morph means
shape. Thus isomorphic literally means equal shape.


3.59 Dimension shows whether vector spaces are isomorphic 


Two finite-dimensional vector spaces over F are isomorphic if and only if 
they have the same dimension. 


Proof First suppose V and W are isomorphic finite-dimensional vector
spaces. Thus there exists an isomorphism T from V onto W. Because T is
invertible, we have null T = {0} and range T = W. Thus dim null T = 0
and dim range T = dim W. The formula

\[
\dim V = \dim \operatorname{null} T + \dim \operatorname{range} T
\]

(the Fundamental Theorem of Linear Maps, which is 3.22) thus becomes the
equation dim V = dim W, completing the proof in one direction.

To prove the other direction, suppose V and W are finite-dimensional
vector spaces with the same dimension. Let $v_1, \dots, v_n$ be a basis of V and
$w_1, \dots, w_n$ be a basis of W. Let $T \in L(V, W)$ be defined by

\[
T(c_1 v_1 + \cdots + c_n v_n) = c_1 w_1 + \cdots + c_n w_n.
\]

Then T is a well-defined linear map because $v_1, \dots, v_n$ is a basis of V
(see 3.5). Also, T is surjective because $w_1, \dots, w_n$ spans W. Furthermore,
null T = {0} because $w_1, \dots, w_n$ is linearly independent; thus T is injective.
Because T is injective and surjective, it is an isomorphism (see 3.56). Hence
V and W are isomorphic, as desired. ∎

The previous result implies that each finite-dimensional vector space V is
isomorphic to $F^n$, where n = dim V.

Because every finite-dimensional vector space is isomorphic to some
$F^n$, why not just study $F^n$ instead of more general vector spaces? To answer
this question, note that an investigation of $F^n$ would soon lead
to other vector spaces. For example, we would encounter the null
space and range of linear maps. Although each of these vector spaces
is isomorphic to some $F^n$, thinking of them that way often adds complexity
but no new insight.

If $v_1, \dots, v_n$ is a basis of V and $w_1, \dots, w_m$ is a basis of W, then for
each $T \in L(V, W)$, we have a matrix $M(T) \in F^{m,n}$. In other words, once
bases have been fixed for V and W, M becomes a function from L(V, W)
to $F^{m,n}$. Notice that 3.36 and 3.38 show that M is a linear map. This linear map
is actually invertible, as we now show.


3.60 L(V, W) and $F^{m,n}$ are isomorphic


Suppose $v_1, \dots, v_n$ is a basis of V and $w_1, \dots, w_m$ is a basis of W.
Then M is an isomorphism between L(V, W) and $F^{m,n}$.


Proof We already noted that M is linear. We need to prove that M is injective
and surjective. Both are easy. We begin with injectivity. If $T \in L(V, W)$
and M(T) = 0, then $Tv_k = 0$ for k = 1, ..., n. Because $v_1, \dots, v_n$ is a
basis of V, this implies T = 0. Thus M is injective (by 3.16).

To prove that M is surjective, suppose $A \in F^{m,n}$. Let T be the linear map
from V to W such that

\[
Tv_k = \sum_{j=1}^{m} A_{j,k} w_j
\]

for k = 1, ..., n (see 3.5). Obviously M(T) equals A, and thus the range of
M equals $F^{m,n}$, as desired. ∎


Now we can determine the dimension of the vector space of linear maps 
from one finite-dimensional vector space to another. 


3.61 dim L(V, W) = (dim V)(dim W)


Suppose V and W are finite-dimensional. Then L(V, W) is finite-dimensional and

\[
\dim L(V, W) = (\dim V)(\dim W).
\]


Proof This follows from 3.60, 3.59, and 3.40. ∎


Linear Maps Thought of as Matrix Multiplication


Previously we defined the matrix of a linear map. Now we define the matrix 
of a vector. 


3.62 Definition matrix of a vector, M(v) 


Suppose $v \in V$ and $v_1, \dots, v_n$ is a basis of V. The matrix of v with
respect to this basis is the n-by-1 matrix

\[
M(v) = \begin{pmatrix} c_1 \\ \vdots \\ c_n \end{pmatrix},
\]

where $c_1, \dots, c_n$ are the scalars such that

\[
v = c_1 v_1 + \cdots + c_n v_n.
\]


The matrix M(v) of a vector $v \in V$ depends on the basis $v_1, \dots, v_n$ of V,
as well as on v. However, the basis should be clear from the context and thus
it is not included in the notation.


3.63 Example matrix of a vector 
• The matrix of $2 - 7x + 5x^3$ with respect to the standard basis of $P_3(R)$
is

\[
\begin{pmatrix} 2 \\ -7 \\ 0 \\ 5 \end{pmatrix}.
\]


• The matrix of a vector $x \in F^n$ with respect to the standard basis is
obtained by writing the coordinates of x as the entries in an n-by-1
matrix. In other words, if $x = (x_1, \dots, x_n) \in F^n$, then

\[
M(x) = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.
\]


Occasionally we want to think of elements of V as relabeled to be n-by-1
matrices. Once a basis $v_1, \dots, v_n$ is chosen, the function M that takes $v \in V$
to M(v) is an isomorphism of V onto $F^{n,1}$ that implements this relabeling.

Recall that if A is an m-by-n matrix, then $A_{\cdot,k}$ denotes the kth column of
A, thought of as an m-by-1 matrix. In the next result, $M(Tv_k)$ is computed
with respect to the basis $w_1, \dots, w_m$ of W.


3.64 $M(T)_{\cdot,k} = M(Tv_k)$


Suppose $T \in L(V, W)$ and $v_1, \dots, v_n$ is a basis of V and $w_1, \dots, w_m$ is
a basis of W. Let $1 \le k \le n$. Then the kth column of M(T), which is
denoted by $M(T)_{\cdot,k}$, equals $M(Tv_k)$.


Proof The desired result follows immediately from the definitions of M(T)
and $M(Tv_k)$. ∎


The next result shows how the notions of the matrix of a linear map, the 
matrix of a vector, and matrix multiplication fit together. 

3.65 Linear maps act like matrix multiplication


Suppose $T \in L(V, W)$ and $v \in V$. Suppose $v_1, \dots, v_n$ is a basis of V and
$w_1, \dots, w_m$ is a basis of W. Then

\[
M(Tv) = M(T) M(v).
\]


Proof Suppose $v = c_1 v_1 + \cdots + c_n v_n$, where $c_1, \dots, c_n \in F$. Thus

3.66   $Tv = c_1 Tv_1 + \cdots + c_n Tv_n.$

Hence

\[
\begin{aligned}
M(Tv) &= c_1 M(Tv_1) + \cdots + c_n M(Tv_n) \\
&= c_1 M(T)_{\cdot,1} + \cdots + c_n M(T)_{\cdot,n} \\
&= M(T) M(v),
\end{aligned}
\]

where the first equality follows from 3.66 and the linearity of M, the second
equality comes from 3.64, and the last equality comes from 3.52. ∎
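
As an illustration of 3.65, take T to be the differentiation map D from $P_3(R)$ to $P_2(R)$ with the standard bases 1, x, x^2, x^3 and 1, x, x^2. The sketch below rests on assumptions of ours: a polynomial is stored as its list of coefficients in increasing degree (so the list is exactly the matrix of the polynomial as a vector), and the helper names `derivative_coeffs` and `mat_vec` are hypothetical. The matrix of D is built column by column, as 3.64 prescribes.

```python
def derivative_coeffs(coeffs):
    """Coefficients of p' from the coefficients of p (increasing degree)."""
    return [k * coeffs[k] for k in range(1, len(coeffs))]

def mat_vec(A, x):
    """Product of a matrix (list of rows) with a column given as a flat list."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

# Column k of M(D) is the matrix of D applied to the k-th basis vector x^k.
columns = []
for k in range(4):
    e_k = [1 if i == k else 0 for i in range(4)]
    columns.append(derivative_coeffs(e_k))
M_D = [list(row) for row in zip(*columns)]

M_p = [2, -7, 0, 5]              # the matrix of p = 2 - 7x + 5x^3
lhs = derivative_coeffs(M_p)     # M(Dp), computed directly
rhs = mat_vec(M_D, M_p)          # M(D) M(p), computed by matrix multiplication
```

Both `lhs` and `rhs` equal [−7, 0, 15], the coefficients of p' = −7 + 15x^2.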


Each m-by-n matrix A induces a linear map from $F^{n,1}$ to $F^{m,1}$, namely the
matrix multiplication function that takes $x \in F^{n,1}$ to $Ax \in F^{m,1}$. The result
above can be used to think of every linear map (from one finite-dimensional
vector space to another finite-dimensional vector space) as a matrix
multiplication map after suitable relabeling via the isomorphisms given by M.
Specifically, if $T \in L(V, W)$ and we identify $v \in V$ with $M(v) \in F^{n,1}$, then
the result above says that we can identify Tv with M(T)M(v).

Because the result above allows us to think (via isomorphisms) of each
linear map as multiplication on $F^{n,1}$ by some matrix A, keep in mind that the
specific matrix A depends not only on the linear map but also on the choice
of bases. One of the themes of many of the most important results in later
chapters will be the choice of a basis that makes the matrix A as simple as
possible.

In this book, we concentrate on linear maps rather than on matrices. How- 
ever, sometimes thinking of linear maps as matrices (or thinking of matrices 
as linear maps) gives important insights that we will find useful. 


Operators 


Linear maps from a vector space to itself are so important that they get a 
special name and special notation. 


3.67 Definition operator, L(V)

• A linear map from a vector space to itself is called an operator.


• The notation L(V) denotes the set of all operators on V. In other
words, L(V) = L(V, V).


The deepest and most important parts of linear algebra, as well as
most of the rest of this book, deal with operators.

A linear map is invertible if it is injective and surjective. For an operator,
you might wonder whether injectivity alone, or surjectivity alone, is enough
to imply invertibility. On infinite-dimensional vector spaces, neither
condition alone implies invertibility, as illustrated by the next example,
which uses two familiar operators from Example 3.4.


3.68 Example neither injectivity nor surjectivity implies invertibility

• The multiplication by $x^2$ operator on P(R) is injective but not surjective.

• The backward shift operator on $F^\infty$ is surjective but not injective.


In view of the example above, the next result is remarkable: it states
that for operators on a finite-dimensional vector space, either injectivity or
surjectivity alone implies the other condition. Often it is easier to check that
an operator on a finite-dimensional vector space is injective, and then we get
surjectivity for free.


3.69 Injectivity is equivalent to surjectivity in finite dimensions


Suppose V is finite-dimensional and $T \in L(V)$. Then the following are
equivalent:


(a) T is invertible;
(b) T is injective;
(c) T is surjective.


Proof Clearly (a) implies (b).

Now suppose (b) holds, so that T is injective. Thus null T = {0} (by 3.16).
From the Fundamental Theorem of Linear Maps (3.22) we have

\[
\begin{aligned}
\dim \operatorname{range} T &= \dim V - \dim \operatorname{null} T \\
&= \dim V.
\end{aligned}
\]

Thus range T equals V. Thus T is surjective. Hence (b) implies (c).

Now suppose (c) holds, so that T is surjective. Thus range T = V. From
the Fundamental Theorem of Linear Maps (3.22) we have

\[
\begin{aligned}
\dim \operatorname{null} T &= \dim V - \dim \operatorname{range} T \\
&= 0.
\end{aligned}
\]

Thus null T equals {0}. Thus T is injective (by 3.16), and so T is invertible
(we already knew that T was surjective). Hence (c) implies (a), completing
the proof. ∎


The next example illustrates the power of the previous result. Although 
it is possible to prove the result in the example below without using linear 
algebra, the proof using linear algebra is cleaner and easier. 


3.70 Example Show that for each polynomial $q \in P(R)$, there exists a
polynomial $p \in P(R)$ with $((x^2 + 5x + 7)p)'' = q$.


Solution Example 3.68 shows that the magic of 3.69 does not apply to the
infinite-dimensional vector space P(R). However, each nonzero polynomial
q has some degree m. By restricting attention to $P_m(R)$, we can work with a
finite-dimensional vector space.

Suppose $q \in P_m(R)$. Define $T: P_m(R) \to P_m(R)$ by

\[
Tp = ((x^2 + 5x + 7)p)''.
\]

Multiplying a nonzero polynomial by $(x^2 + 5x + 7)$ increases the degree
by 2, and then differentiating twice reduces the degree by 2. Thus T is indeed
an operator on $P_m(R)$.

Every polynomial whose second derivative equals 0 is of the form ax + b,
where $a, b \in R$. If $p \ne 0$, then $(x^2 + 5x + 7)p$ has degree at least 2 and so is
not of this form. Thus null T = {0}. Hence T is injective.

Now 3.69 implies that T is surjective. Thus there exists a polynomial
$p \in P_m(R)$ such that $((x^2 + 5x + 7)p)'' = q$, as desired.
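
The existence argument in Example 3.70 can also be carried out numerically for a particular q. The sketch below is one possible approach under assumptions of ours: polynomials are coefficient lists in increasing degree, m = 2, the sample right side is q = 1 + 3x^2, and all helper names are hypothetical. It builds the matrix of the operator $Tp = ((x^2 + 5x + 7)p)''$ on $P_2(R)$ with respect to the basis 1, x, x^2 and solves the resulting linear system with exact rational arithmetic.

```python
from fractions import Fraction

def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def second_derivative(a):
    """Coefficients of the second derivative of a polynomial."""
    return [k * (k - 1) * a[k] for k in range(2, len(a))]

def T(p):
    """Tp = ((x^2 + 5x + 7) p)'' on coefficient lists (increasing degree)."""
    return second_derivative(poly_mul([Fraction(7), Fraction(5), Fraction(1)], p))

def solve(A, b):
    """Solve the square system A x = b by Gauss-Jordan elimination.

    Assumes A is invertible, which 3.69 guarantees here because T is injective.
    """
    n = len(A)
    aug = [list(row) + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [x / aug[col][col] for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]

m = 2
# Column k of the matrix of T holds the coefficients of T(x^k).
cols = []
for k in range(m + 1):
    e_k = [Fraction(1) if i == k else Fraction(0) for i in range(m + 1)]
    cols.append(T(e_k))
M_T = [list(row) for row in zip(*cols)]

q = [Fraction(1), Fraction(0), Fraction(3)]   # q = 1 + 3x^2
p = solve(M_T, q)
```

For this q the solution is p = 5 − (5/4)x + (1/4)x^2, and applying `T` to `p` returns `q`.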


Exercise 30 in Section 6.A gives a similar but more spectacular application 
of 3.69. The result in that exercise is quite difficult to prove without using 
linear algebra. 


EXERCISES 3.D 


1 Suppose $T \in L(U, V)$ and $S \in L(V, W)$ are both invertible linear maps.
Prove that $ST \in L(U, W)$ is invertible and that $(ST)^{-1} = T^{-1} S^{-1}$.


2 Suppose V is finite-dimensional and dim V > 1. Prove that the set of
noninvertible operators on V is not a subspace of L(V).


3 Suppose V is finite-dimensional, U is a subspace of V, and $S \in L(U, V)$.
Prove there exists an invertible operator $T \in L(V)$ such that Tu = Su
for every $u \in U$ if and only if S is injective.


4 Suppose W is finite-dimensional and $T_1, T_2 \in L(V, W)$. Prove that
null $T_1$ = null $T_2$ if and only if there exists an invertible operator
$S \in L(W)$ such that $T_1 = S T_2$.


5 Suppose V is finite-dimensional and $T_1, T_2 \in L(V, W)$. Prove that
range $T_1$ = range $T_2$ if and only if there exists an invertible operator
$S \in L(V)$ such that $T_1 = T_2 S$.


6 Suppose V and W are finite-dimensional and $T_1, T_2 \in L(V, W)$. Prove
that there exist invertible operators $R \in L(V)$ and $S \in L(W)$ such that
$T_1 = S T_2 R$ if and only if dim null $T_1$ = dim null $T_2$.


7 Suppose V and W are finite-dimensional. Let $v \in V$. Let

\[
E = \{T \in L(V, W) : Tv = 0\}.
\]

(a) Show that E is a subspace of L(V, W).
(b) Suppose $v \ne 0$. What is dim E?


8 Suppose V is finite-dimensional and $T: V \to W$ is a surjective linear
map of V onto W. Prove that there is a subspace U of V such that
$T|_U$ is an isomorphism of U onto W. (Here $T|_U$ means the function T
restricted to U. In other words, $T|_U$ is the function whose domain is U,
with $T|_U$ defined by $T|_U(u) = Tu$ for every $u \in U$.)


9 Suppose V is finite-dimensional and $S, T \in L(V)$. Prove that ST is
invertible if and only if both S and T are invertible.


10 Suppose V is finite-dimensional and $S, T \in L(V)$. Prove that ST = I
if and only if TS = I.


11 Suppose V is finite-dimensional and $S, T, U \in L(V)$ and STU = I.
Show that T is invertible and that $T^{-1} = US$.


12 Show that the result in the previous exercise can fail without the
hypothesis that V is finite-dimensional.


13 Suppose V is a finite-dimensional vector space and $R, S, T \in L(V)$ are
such that RST is surjective. Prove that S is injective.


14 Suppose $v_1, \dots, v_n$ is a basis of V. Prove that the map $T: V \to F^{n,1}$
defined by

\[
Tv = M(v)
\]

is an isomorphism of V onto $F^{n,1}$; here M(v) is the matrix of $v \in V$
with respect to the basis $v_1, \dots, v_n$.


15 Prove that every linear map from $F^{n,1}$ to $F^{m,1}$ is given by a matrix
multiplication. In other words, prove that if $T \in L(F^{n,1}, F^{m,1})$, then
there exists an m-by-n matrix A such that Tx = Ax for every $x \in F^{n,1}$.


16 Suppose V is finite-dimensional and $T \in L(V)$. Prove that T is a scalar
multiple of the identity if and only if ST = TS for every $S \in L(V)$.


17 Suppose V is finite-dimensional and $\mathcal{E}$ is a subspace of L(V) such that
$ST \in \mathcal{E}$ and $TS \in \mathcal{E}$ for all $S \in L(V)$ and all $T \in \mathcal{E}$. Prove that
$\mathcal{E} = \{0\}$ or $\mathcal{E} = L(V)$.


18 Show that V and L(F, V) are isomorphic vector spaces.


19 Suppose $T \in L(P(R))$ is such that T is injective and $\deg Tp \le \deg p$
for every nonzero polynomial $p \in P(R)$.

(a) Prove that T is surjective.

(b) Prove that deg Tp = deg p for every nonzero $p \in P(R)$.


20 Suppose n is a positive integer and $A_{i,j} \in F$ for i, j = 1, ..., n. Prove
that the following are equivalent (note that in both parts below, the
number of equations equals the number of variables):

(a) The trivial solution $x_1 = \cdots = x_n = 0$ is the only solution to the
homogeneous system of equations

\[
\begin{aligned}
\sum_{k=1}^{n} A_{1,k} x_k &= 0 \\
&\;\;\vdots \\
\sum_{k=1}^{n} A_{n,k} x_k &= 0.
\end{aligned}
\]

(b) For every $c_1, \dots, c_n \in F$, there exists a solution to the system of
equations

\[
\begin{aligned}
\sum_{k=1}^{n} A_{1,k} x_k &= c_1 \\
&\;\;\vdots \\
\sum_{k=1}^{n} A_{n,k} x_k &= c_n.
\end{aligned}
\]


3.E Products and Quotients of Vector Spaces


Products of Vector Spaces 


As usual when dealing with more than one vector space, all the vector spaces 
in use should be over the same field. 


3.71 Definition product of vector spaces

Suppose $V_1, \dots, V_m$ are vector spaces over F.

• The product $V_1 \times \cdots \times V_m$ is defined by

\[
V_1 \times \cdots \times V_m = \{(v_1, \dots, v_m) : v_1 \in V_1, \dots, v_m \in V_m\}.
\]

• Addition on $V_1 \times \cdots \times V_m$ is defined by

\[
(u_1, \dots, u_m) + (v_1, \dots, v_m) = (u_1 + v_1, \dots, u_m + v_m).
\]

• Scalar multiplication on $V_1 \times \cdots \times V_m$ is defined by

\[
\lambda (v_1, \dots, v_m) = (\lambda v_1, \dots, \lambda v_m).
\]


3.72 Example Elements of $P_2(R) \times R^3$ are lists of length 2, with the
first item in the list an element of $P_2(R)$ and the second item in the list an
element of $R^3$.

For example, $(5 - 6x + 4x^2, (3, 8, 7)) \in P_2(R) \times R^3$.


The next result should be interpreted to mean that the product of vector 
spaces is a vector space with the operations of addition and scalar multiplica- 
tion as defined above. 


3.73 Product of vector spaces is a vector space 


Suppose $V_1, \dots, V_m$ are vector spaces over F. Then $V_1 \times \cdots \times V_m$ is a
vector space over F.


The proof of the result above is left to the reader. Note that the additive
identity of $V_1 \times \cdots \times V_m$ is (0, ..., 0), where the 0 in the jth slot is the
additive identity of $V_j$. The additive inverse of $(v_1, \dots, v_m) \in V_1 \times \cdots \times V_m$
is $(-v_1, \dots, -v_m)$.


3.74 Example Is $R^2 \times R^3$ equal to $R^5$? Is $R^2 \times R^3$ isomorphic to $R^5$?


Solution Elements of $R^2 \times R^3$ are lists $((x_1, x_2), (x_3, x_4, x_5))$, where
$x_1, x_2, x_3, x_4, x_5 \in R$.

Elements of $R^5$ are lists $(x_1, x_2, x_3, x_4, x_5)$, where $x_1, x_2, x_3, x_4, x_5 \in R$.

Although these look almost the same, they are not the same kind of object.
Elements of $R^2 \times R^3$ are lists of length 2 (with the first item itself a list of
length 2 and the second item a list of length 3), and elements of $R^5$ are lists
of length 5. Thus $R^2 \times R^3$ does not equal $R^5$.

The linear map that takes a vector $((x_1, x_2), (x_3, x_4, x_5)) \in R^2 \times R^3$ to
$(x_1, x_2, x_3, x_4, x_5) \in R^5$ is clearly an isomorphism of $R^2 \times R^3$ onto $R^5$.
Thus these two vector spaces are isomorphic.

In this case, the isomorphism is so natural that we should think of it as a
relabeling. Some people would even informally say that $R^2 \times R^3$ equals $R^5$,
which is not technically correct but which captures the spirit of identification
via relabeling.
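
The distinction drawn in Example 3.74 is easy to see in code, where a pair of tuples and a flat 5-tuple are different objects even though one converts to the other without loss. The sketch below is illustrative only; `flatten`, `unflatten`, and `add_product` are names of our own, and the sample vectors are arbitrary.

```python
def flatten(pair):
    """The relabeling map from R^2 x R^3 to R^5."""
    (x1, x2), (x3, x4, x5) = pair
    return (x1, x2, x3, x4, x5)

def unflatten(x):
    """The inverse relabeling from R^5 to R^2 x R^3."""
    x1, x2, x3, x4, x5 = x
    return ((x1, x2), (x3, x4, x5))

def add_product(a, b):
    """Addition on R^2 x R^3, done slot by slot as in Definition 3.71."""
    return tuple(tuple(s + t for s, t in zip(slot_a, slot_b))
                 for slot_a, slot_b in zip(a, b))

a = ((1, 2), (3, 4, 5))
b = ((10, 20), (30, 40, 50))

# Flattening the sum gives the same result as summing the flattened vectors.
sum_then_flatten = flatten(add_product(a, b))
flatten_then_sum = tuple(s + t for s, t in zip(flatten(a), flatten(b)))
```

The element `a` has length 2 while `flatten(a)` has length 5, so they are not equal; yet `unflatten(flatten(a))` recovers `a`, and the two sums above agree, which is additivity of the relabeling map on this sample.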


The next example illustrates the idea of the proof of 3.76. 


3.75 Example Find a basis of $P_2(R) \times R^2$.

Solution Consider this list of length 5 of elements of $P_2(R) \times R^2$:

\[
(1, (0, 0)), \; (x, (0, 0)), \; (x^2, (0, 0)), \; (0, (1, 0)), \; (0, (0, 1)).
\]

The list above is linearly independent and it spans $P_2(R) \times R^2$. Thus it is a
basis of $P_2(R) \times R^2$.


3.76 Dimension of a product is the sum of dimensions 


Suppose $V_1, \dots, V_m$ are finite-dimensional vector spaces. Then
$V_1 \times \cdots \times V_m$ is finite-dimensional and

\[
\dim(V_1 \times \cdots \times V_m) = \dim V_1 + \cdots + \dim V_m.
\]


Proof Choose a basis of each $V_j$. For each basis vector of each $V_j$, consider
the element of $V_1 \times \cdots \times V_m$ that equals the basis vector in the jth slot and
0 in the other slots. The list of all such vectors is linearly independent and
spans $V_1 \times \cdots \times V_m$. Thus it is a basis of $V_1 \times \cdots \times V_m$. The length of this
basis is $\dim V_1 + \cdots + \dim V_m$, as desired. ∎


Products and Direct Sums


In the next result, the map $\Gamma$ is surjective by the definition of $U_1 + \cdots + U_m$.
Thus the last word in the result below could be changed from “injective” to
“invertible”.


3.77 Products and direct sums


Suppose that $U_1, \dots, U_m$ are subspaces of V. Define a linear map
$\Gamma: U_1 \times \cdots \times U_m \to U_1 + \cdots + U_m$ by

\[
\Gamma(u_1, \dots, u_m) = u_1 + \cdots + u_m.
\]

Then $U_1 + \cdots + U_m$ is a direct sum if and only if $\Gamma$ is injective.

Proof The linear map $\Gamma$ is injective if and only if the only way to write 0 as a
sum $u_1 + \cdots + u_m$, where each $u_j$ is in $U_j$, is by taking each $u_j$ equal to 0.
Thus 1.44 shows that $\Gamma$ is injective if and only if $U_1 + \cdots + U_m$ is a direct
sum, as desired. ∎


3.78 A sum is a direct sum if and only if dimensions add up 


Suppose V is finite-dimensional and $U_1, \dots, U_m$ are subspaces of V. Then
$U_1 + \cdots + U_m$ is a direct sum if and only if

\[
\dim(U_1 + \cdots + U_m) = \dim U_1 + \cdots + \dim U_m.
\]


Proof The map $\Gamma$ in 3.77 is surjective. Thus by the Fundamental Theorem
of Linear Maps (3.22), $\Gamma$ is injective if and only if

\[
\dim(U_1 + \cdots + U_m) = \dim(U_1 \times \cdots \times U_m).
\]

Combining 3.77 and 3.76 now shows that $U_1 + \cdots + U_m$ is a direct sum if
and only if

\[
\dim(U_1 + \cdots + U_m) = \dim U_1 + \cdots + \dim U_m,
\]

as desired. ∎
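
The dimension criterion in 3.78 gives a mechanical test for direct sums when each subspace is described by a spanning list. The sketch below is one way to carry it out, under assumptions of ours: subspaces of R^3 are given by spanning lists of tuples, dimensions are computed by row reduction over the rationals, and the names `rank` and `is_direct_sum` are hypothetical.

```python
from fractions import Fraction

def rank(vectors):
    """Dimension of the span of the given vectors, by row reduction."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                f = rows[i][col] / rows[r][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def is_direct_sum(subspaces):
    """Test the criterion of 3.78; each subspace is given by a spanning list."""
    dim_of_sum = rank([v for U in subspaces for v in U])
    return dim_of_sum == sum(rank(U) for U in subspaces)

U1 = [(1, 0, 0), (0, 1, 0)]   # the xy-plane in R^3
U2 = [(0, 0, 1)]              # the z-axis
U3 = [(1, 1, 0)]              # a line lying inside the xy-plane
```

Here `is_direct_sum([U1, U2])` is true because 3 = 2 + 1, while `is_direct_sum([U1, U3])` is false because the sum has dimension 2 but the dimensions add up to 3.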


In the special case m = 2, an alternative proof that $U_1 + U_2$ is a direct
sum if and only if $\dim(U_1 + U_2) = \dim U_1 + \dim U_2$ can be obtained by
combining 1.45 and 2.43.


Quotients of Vector Spaces


We begin our approach to quotient spaces by defining the sum of a vector and 
a subspace. 


3.79 Definition v + U


Suppose $v \in V$ and U is a subspace of V. Then v + U is the subset of V
defined by

\[
v + U = \{v + u : u \in U\}.
\]


3.80 Example Suppose

\[
U = \{(x, 2x) \in R^2 : x \in R\}.
\]

Then U is the line in $R^2$ through the origin with slope 2. Thus

\[
(17, 20) + U
\]

is the line in $R^2$ that contains the point (17, 20) and has slope 2.

[Figure: the parallel lines U and (17, 20) + U, with the points (10, 20) and (17, 20) marked.]


3.81 Definition affine subset, parallel 


• An affine subset of V is a subset of V of the form v + U for some
$v \in V$ and some subspace U of V.


• For $v \in V$ and U a subspace of V, the affine subset v + U is said to
be parallel to U.


3.82 Example parallel affine subsets

• In Example 3.80 above, all the lines in $R^2$ with slope 2 are parallel to U.


• If $U = \{(x, y, 0) \in R^3 : x, y \in R\}$, then the affine subsets of $R^3$
parallel to U are the planes in $R^3$ that are parallel to the xy-plane U in
the usual sense.

Important: With the definition of parallel given in 3.81, no line in $R^3$
is considered to be an affine subset that is parallel to the plane U.


3.83 Definition quotient space, V/U


Suppose U is a subspace of V. Then the quotient space V/U is the set of
all affine subsets of V parallel to U. In other words,

\[
V/U = \{v + U : v \in V\}.
\]


3.84 Example quotient spaces 
• If $U = \{(x, 2x) \in R^2 : x \in R\}$, then $R^2/U$ is the set of all lines in
$R^2$ that have slope 2.


• If U is a line in $R^3$ containing the origin, then $R^3/U$ is the set of all
lines in $R^3$ parallel to U.


• If U is a plane in $R^3$ containing the origin, then $R^3/U$ is the set of all
planes in $R^3$ parallel to U.


Our next goal is to make V/U into a vector space. To do this, we will 
need the following result. 


3.85 Two affine subsets parallel to U are equal or disjoint 


Suppose U is a subspace of V and $v, w \in V$. Then the following are
equivalent:


(a) $v - w \in U$;
(b) $v + U = w + U$;
(c) $(v + U) \cap (w + U) \neq \emptyset$.


Proof First suppose (a) holds, so $v - w \in U$. If $u \in U$, then
\[ v + u = w + \bigl((v - w) + u\bigr) \in w + U. \]

Thus $v + U \subset w + U$. Similarly, $w + U \subset v + U$. Thus $v + U = w + U$,
completing the proof that (a) implies (b).

Obviously (b) implies (c).

Now suppose (c) holds, so $(v + U) \cap (w + U) \neq \emptyset$. Thus there exist
$u_1, u_2 \in U$ such that

\[ v + u_1 = w + u_2. \]

Thus $v - w = u_2 - u_1$. Hence $v - w \in U$, showing that (c) implies (a) and
completing the proof. ∎


Now we can define addition and scalar multiplication on V/U. 


3.86 Definition addition and scalar multiplication on V/U 


Suppose U is a subspace of V. Then addition and scalar multiplication
are defined on V/U by

\[ (v + U) + (w + U) = (v + w) + U \]
\[ \lambda(v + U) = (\lambda v) + U \]

for $v, w \in V$ and $\lambda \in \mathbf{F}$.


As part of the proof of the next result, we will show that the definitions 
above make sense. 


3.87 Quotient space is a vector space 


Suppose U is a subspace of V. Then V/U, with the operations of addition 
and scalar multiplication as defined above, is a vector space. 


Proof The potential problem with the definitions above of addition and scalar
multiplication on V/U is that the representation of an affine subset parallel to
U is not unique. Specifically, suppose $v, w \in V$. Suppose also that $\hat{v}, \hat{w} \in V$
are such that $v + U = \hat{v} + U$ and $w + U = \hat{w} + U$. To show that the
definition of addition on V/U given above makes sense, we must show that
\[ (v + w) + U = (\hat{v} + \hat{w}) + U. \]

By 3.85, we have

\[ v - \hat{v} \in U \quad\text{and}\quad w - \hat{w} \in U. \]

Because U is a subspace of V and thus is closed under addition, this implies
that $(v - \hat{v}) + (w - \hat{w}) \in U$. Thus $(v + w) - (\hat{v} + \hat{w}) \in U$. Using 3.85 again,
we see that

\[ (v + w) + U = (\hat{v} + \hat{w}) + U, \]

as desired. Thus the definition of addition on V/U makes sense.

Similarly, suppose $\lambda \in \mathbf{F}$. Because U is a subspace of V and thus is
closed under scalar multiplication, we have $\lambda(v - \hat{v}) \in U$. Thus $\lambda v - \lambda\hat{v} \in U$.
Hence 3.85 implies that $(\lambda v) + U = (\lambda\hat{v}) + U$. Thus the definition of scalar
multiplication on V/U makes sense.

Now that addition and scalar multiplication have been defined on V/U, the
verification that these operations make V/U into a vector space is straightforward
and is left to the reader. Note that the additive identity of V/U is $0 + U$
(which equals U) and that the additive inverse of $v + U$ is $(-v) + U$. ∎

The next concept will give us an easy way to compute the dimension
of V/U.


3.88 Definition quotient map, $\pi$


Suppose U is a subspace of V. The quotient map $\pi$ is the linear map
$\pi\colon V \to V/U$ defined by

\[ \pi(v) = v + U \]

for $v \in V$.

The reader should verify that $\pi$ is indeed a linear map. Although $\pi$
depends on U as well as V, these spaces are left out of the notation because
they should be clear from the context.


3.89 Dimension of a quotient space 


Suppose V is finite-dimensional and U is a subspace of V. Then 


dim V/U = dim V — dim U. 


Proof Let $\pi$ be the quotient map from V to V/U. From 3.85, we see that
$\operatorname{null} \pi = U$. Clearly $\operatorname{range} \pi = V/U$. The Fundamental Theorem of Linear
Maps (3.22) thus tells us that

\[ \dim V = \dim U + \dim V/U, \]

which gives the desired result. ∎


Each linear map T on V induces a linear map $\tilde{T}$ on $V/(\operatorname{null} T)$, which we
now define.


3.90 Definition $\tilde{T}$
Suppose $T \in \mathcal{L}(V, W)$. Define $\tilde{T}\colon V/(\operatorname{null} T) \to W$ by
\[ \tilde{T}(v + \operatorname{null} T) = Tv. \]


To show that the definition of $\tilde{T}$ makes sense, suppose $u, v \in V$ are such
that $u + \operatorname{null} T = v + \operatorname{null} T$. By 3.85, we have $u - v \in \operatorname{null} T$. Thus
$T(u - v) = 0$. Hence $Tu = Tv$. Thus the definition of $\tilde{T}$ indeed makes
sense.

3.91 Null space and range of $\tilde{T}$
Suppose $T \in \mathcal{L}(V, W)$. Then


(a) $\tilde{T}$ is a linear map from $V/(\operatorname{null} T)$ to W;
(b) $\tilde{T}$ is injective;
(c) $\operatorname{range} \tilde{T} = \operatorname{range} T$;
(d) $V/(\operatorname{null} T)$ is isomorphic to $\operatorname{range} T$.


Proof


(a) The routine verification that $\tilde{T}$ is linear is left to the reader.


(b) Suppose $v \in V$ and $\tilde{T}(v + \operatorname{null} T) = 0$. Then $Tv = 0$. Thus $v \in \operatorname{null} T$.
Hence 3.85 implies that $v + \operatorname{null} T = 0 + \operatorname{null} T$. This implies that
$\operatorname{null} \tilde{T} = \{0\}$, and hence $\tilde{T}$ is injective, as desired.


(c) The definition of $\tilde{T}$ shows that $\operatorname{range} \tilde{T} = \operatorname{range} T$.


(d) Parts (b) and (c) imply that if we think of $\tilde{T}$ as mapping into $\operatorname{range} T$,
then $\tilde{T}$ is an isomorphism from $V/(\operatorname{null} T)$ onto $\operatorname{range} T$. ∎


EXERCISES 3.E 


1 Suppose T is a function from V to W. The graph of T is the subset of
$V \times W$ defined by

\[ \text{graph of } T = \{(v, Tv) \in V \times W : v \in V\}. \]

Prove that T is a linear map if and only if the graph of T is a subspace
of $V \times W$.

[Formally, a function T from V to W is a subset T of $V \times W$ such that
for each $v \in V$, there exists exactly one element $(v, w) \in T$. In other
words, formally a function is what is called above its graph. We do
not usually think of functions in this formal manner. However, if we do
become formal, then the exercise above could be rephrased as follows:
Prove that a function T from V to W is a linear map if and only if T is
a subspace of $V \times W$.]


2 Suppose $V_1, \dots, V_m$ are vector spaces such that $V_1 \times \cdots \times V_m$ is
finite-dimensional. Prove that $V_j$ is finite-dimensional for each $j = 1, \dots, m$.


3 Give an example of a vector space V and subspaces $U_1, U_2$ of V such
that $U_1 \times U_2$ is isomorphic to $U_1 + U_2$ but $U_1 + U_2$ is not a direct sum.

Hint: The vector space V must be infinite-dimensional.


4 Suppose $V_1, \dots, V_m$ are vector spaces. Prove that $\mathcal{L}(V_1 \times \cdots \times V_m, W)$
and $\mathcal{L}(V_1, W) \times \cdots \times \mathcal{L}(V_m, W)$ are isomorphic vector spaces.


5 Suppose $W_1, \dots, W_m$ are vector spaces. Prove that $\mathcal{L}(V, W_1 \times \cdots \times W_m)$
and $\mathcal{L}(V, W_1) \times \cdots \times \mathcal{L}(V, W_m)$ are isomorphic vector spaces.


6 For n a positive integer, define $V^n$ by
\[ V^n = \underbrace{V \times \cdots \times V}_{n \text{ times}}. \]

Prove that $V^n$ and $\mathcal{L}(\mathbf{F}^n, V)$ are isomorphic vector spaces.


7 Suppose v, x are vectors in V and U, W are subspaces of V such that
$v + U = x + W$. Prove that U = W.


8 Prove that a nonempty subset A of V is an affine subset of V if and only
if $\lambda v + (1 - \lambda)w \in A$ for all $v, w \in A$ and all $\lambda \in \mathbf{F}$.


9 Suppose $A_1$ and $A_2$ are affine subsets of V. Prove that the intersection
$A_1 \cap A_2$ is either an affine subset of V or the empty set.


10 Prove that the intersection of every collection of affine subsets of V is
either an affine subset of V or the empty set.


11 Suppose $v_1, \dots, v_m \in V$. Let
\[ A = \{\lambda_1 v_1 + \cdots + \lambda_m v_m : \lambda_1, \dots, \lambda_m \in \mathbf{F} \text{ and } \lambda_1 + \cdots + \lambda_m = 1\}. \]

(a) Prove that A is an affine subset of V.

(b) Prove that every affine subset of V that contains $v_1, \dots, v_m$ also
contains A.

(c) Prove that $A = v + U$ for some $v \in V$ and some subspace U of
V with $\dim U \le m - 1$.


12 Suppose U is a subspace of V such that V/U is finite-dimensional.
Prove that V is isomorphic to $U \times (V/U)$.


13 Suppose U is a subspace of V and $v_1 + U, \dots, v_m + U$ is a basis of
V/U and $u_1, \dots, u_n$ is a basis of U. Prove that $v_1, \dots, v_m, u_1, \dots, u_n$
is a basis of V.


14 Suppose $U = \{(x_1, x_2, \dots) \in \mathbf{F}^\infty : x_j \neq 0 \text{ for only finitely many } j\}$.

(a) Show that U is a subspace of $\mathbf{F}^\infty$.

(b) Prove that $\mathbf{F}^\infty/U$ is infinite-dimensional.


15 Suppose $\varphi \in \mathcal{L}(V, \mathbf{F})$ and $\varphi \neq 0$. Prove that $\dim V/(\operatorname{null} \varphi) = 1$.


16 Suppose U is a subspace of V such that $\dim V/U = 1$. Prove that there
exists $\varphi \in \mathcal{L}(V, \mathbf{F})$ such that $\operatorname{null} \varphi = U$.


17 Suppose U is a subspace of V such that V/U is finite-dimensional.
Prove that there exists a subspace W of V such that $\dim W = \dim V/U$
and $V = U \oplus W$.


18 Suppose $T \in \mathcal{L}(V, W)$ and U is a subspace of V. Let $\pi$ denote the
quotient map from V onto V/U. Prove that there exists $S \in \mathcal{L}(V/U, W)$
such that $T = S \circ \pi$ if and only if $U \subset \operatorname{null} T$.


19 Find a correct statement analogous to 3.78 that is applicable to finite
sets, with unions analogous to sums of subspaces and disjoint unions
analogous to direct sums.


20 Suppose U is a subspace of V. Define $\Gamma\colon \mathcal{L}(V/U, W) \to \mathcal{L}(V, W)$ by
\[ \Gamma(S) = S \circ \pi. \]

(a) Show that $\Gamma$ is a linear map.
(b) Show that $\Gamma$ is injective.
(c) Show that $\operatorname{range} \Gamma = \{T \in \mathcal{L}(V, W) : Tu = 0 \text{ for every } u \in U\}$.

3.F Duality


The Dual Space and the Dual Map 


Linear maps into the scalar field F play a special role in linear algebra, and 
thus they get a special name: 


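3.92 Definition linear functional


A linear functional on V is a linear map from V to F. In other words, a
linear functional is an element of $\mathcal{L}(V, \mathbf{F})$.


3.93 Example linear functionals


• Define $\varphi\colon \mathbf{R}^3 \to \mathbf{R}$ by $\varphi(x, y, z) = 4x - 5y + 2z$. Then $\varphi$ is a linear
functional on $\mathbf{R}^3$.


• Fix $(c_1, \dots, c_n) \in \mathbf{F}^n$. Define $\varphi\colon \mathbf{F}^n \to \mathbf{F}$ by
\[ \varphi(x_1, \dots, x_n) = c_1 x_1 + \cdots + c_n x_n. \]
Then $\varphi$ is a linear functional on $\mathbf{F}^n$.


• Define $\varphi\colon \mathcal{P}(\mathbf{R}) \to \mathbf{R}$ by $\varphi(p) = 3p''(5) + 7p(4)$. Then $\varphi$ is a linear
functional on $\mathcal{P}(\mathbf{R})$.


• Define $\varphi\colon \mathcal{P}(\mathbf{R}) \to \mathbf{R}$ by $\varphi(p) = \int_0^1 p(x)\,dx$. Then $\varphi$ is a linear
functional on $\mathcal{P}(\mathbf{R})$.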

The vector space $\mathcal{L}(V, \mathbf{F})$ also gets a special name and special notation:


3.94 Definition dual space, $V'$


The dual space of V, denoted $V'$, is the vector space of all linear
functionals on V. In other words, $V' = \mathcal{L}(V, \mathbf{F})$.


3.95 $\dim V' = \dim V$


Suppose V is finite-dimensional. Then $V'$ is also finite-dimensional and
$\dim V' = \dim V$.


Proof This result follows from 3.61. ∎

In the following definition, 3.5 implies that each $\varphi_j$ is well defined.


3.96 Definition dual basis


If $v_1, \dots, v_n$ is a basis of V, then the dual basis of $v_1, \dots, v_n$ is the list
$\varphi_1, \dots, \varphi_n$ of elements of $V'$, where each $\varphi_j$ is the linear functional on V
such that
\[ \varphi_j(v_k) = \begin{cases} 1 & \text{if } k = j, \\ 0 & \text{if } k \neq j. \end{cases} \]

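3.97 Example What is the dual basis of the standard basis $e_1, \dots, e_n$
of $\mathbf{F}^n$?


Solution For $1 \le j \le n$, define $\varphi_j$ to be the linear functional on $\mathbf{F}^n$ that
selects the $j$th coordinate of a vector in $\mathbf{F}^n$. In other words,

\[ \varphi_j(x_1, \dots, x_n) = x_j \]

for $(x_1, \dots, x_n) \in \mathbf{F}^n$. Clearly

\[ \varphi_j(e_k) = \begin{cases} 1 & \text{if } k = j, \\ 0 & \text{if } k \neq j. \end{cases} \]

Thus $\varphi_1, \dots, \varphi_n$ is the dual basis of the standard basis $e_1, \dots, e_n$ of $\mathbf{F}^n$.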
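
For a basis of $\mathbf{F}^n$ other than the standard one, the dual basis can be computed by inverting a matrix. The sketch below is an illustration under stated assumptions and is not taken from the text: if the basis vectors are placed in the columns of a matrix B, then row j of $B^{-1}$ holds the coefficients of $\varphi_j$, because the (j, k) entry of $B^{-1}B$ is 1 if k = j and 0 otherwise. The helper names `inverse`, `dual_basis`, and `evaluate` are hypothetical, and the arithmetic uses rational numbers.

```python
from fractions import Fraction


def inverse(B):
    """Invert a square matrix (list of rows) by Gauss-Jordan elimination.

    Raises StopIteration if B is not invertible.
    """
    n = len(B)
    # Augment B with the identity matrix.
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(B)]
    for col in range(n):
        # Swap a row with a nonzero entry in this column into pivot position.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        # Scale the pivot row, then clear this column in every other row.
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]


def dual_basis(basis):
    """Return coefficient vectors of phi_1, ..., phi_n for a basis of F^n."""
    n = len(basis)
    B = [[basis[k][i] for k in range(n)] for i in range(n)]  # basis vectors as columns
    return inverse(B)


def evaluate(phi, v):
    """Apply the functional with coefficient vector phi to the vector v."""
    return sum(c * x for c, x in zip(phi, v))


basis = [(1, 1), (1, -1)]
phis = dual_basis(basis)
# table[j][k] = phi_j(v_k); it should be the identity pattern from 3.96.
table = [[evaluate(phi, v) for v in basis] for phi in phis]
print(table)
```

Applied to the standard basis, `dual_basis` returns the identity matrix, which matches the coordinate functionals found in Example 3.97.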


The next result shows that the dual basis is indeed a basis. Thus the
terminology “dual basis” is justified.

3.98 Dual basis is a basis of the dual space


Suppose V is finite-dimensional. Then the dual basis of a basis of V is a
basis of $V'$.


Proof Suppose $v_1, \dots, v_n$ is a basis of V. Let $\varphi_1, \dots, \varphi_n$ denote the dual
basis.

To show that $\varphi_1, \dots, \varphi_n$ is a linearly independent list of elements of $V'$,
suppose $a_1, \dots, a_n \in \mathbf{F}$ are such that

\[ a_1\varphi_1 + \cdots + a_n\varphi_n = 0. \]

Now $(a_1\varphi_1 + \cdots + a_n\varphi_n)(v_j) = a_j$ for $j = 1, \dots, n$. The equation
above thus shows that $a_1 = \cdots = a_n = 0$. Hence $\varphi_1, \dots, \varphi_n$ is linearly
independent.

Now 2.39 and 3.95 imply that $\varphi_1, \dots, \varphi_n$ is a basis of $V'$. ∎

In the definition below, note that if T is a linear map from V to W then $T'$
is a linear map from $W'$ to $V'$.


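3.99 Definition dual map, $T'$


If $T \in \mathcal{L}(V, W)$, then the dual map of T is the linear map $T' \in \mathcal{L}(W', V')$
defined by $T'(\varphi) = \varphi \circ T$ for $\varphi \in W'$.


If $T \in \mathcal{L}(V, W)$ and $\varphi \in W'$, then $T'(\varphi)$ is defined above to be the
composition of the linear maps $\varphi$ and T. Thus $T'(\varphi)$ is indeed a linear map
from V to F; in other words, $T'(\varphi) \in V'$.

The verification that $T'$ is a linear map from $W'$ to $V'$ is easy:


• If $\varphi, \psi \in W'$, then
\[ T'(\varphi + \psi) = (\varphi + \psi) \circ T = \varphi \circ T + \psi \circ T = T'(\varphi) + T'(\psi). \]

• If $\lambda \in \mathbf{F}$ and $\varphi \in W'$, then
\[ T'(\lambda\varphi) = (\lambda\varphi) \circ T = \lambda(\varphi \circ T) = \lambda T'(\varphi). \]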


In the next example, the prime notation is used with two unrelated meanings:
$D'$ denotes the dual of a linear map D, and $p'$ denotes the derivative of
a polynomial p.


3.100 Example Define $D\colon \mathcal{P}(\mathbf{R}) \to \mathcal{P}(\mathbf{R})$ by $Dp = p'$.


• Suppose $\varphi$ is the linear functional on $\mathcal{P}(\mathbf{R})$ defined by $\varphi(p) = p(3)$.
Then $D'(\varphi)$ is the linear functional on $\mathcal{P}(\mathbf{R})$ given by

\[ (D'(\varphi))(p) = (\varphi \circ D)(p) = \varphi(Dp) = \varphi(p') = p'(3). \]

In other words, $D'(\varphi)$ is the linear functional on $\mathcal{P}(\mathbf{R})$ that takes p to
$p'(3)$.


• Suppose $\varphi$ is the linear functional on $\mathcal{P}(\mathbf{R})$ defined by $\varphi(p) = \int_0^1 p$.
Then $D'(\varphi)$ is the linear functional on $\mathcal{P}(\mathbf{R})$ given by

\[ (D'(\varphi))(p) = (\varphi \circ D)(p) = \varphi(Dp) = \varphi(p') = \int_0^1 p' = p(1) - p(0). \]

In other words, $D'(\varphi)$ is the linear functional on $\mathcal{P}(\mathbf{R})$ that takes p to
$p(1) - p(0)$.
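
The two computations in Example 3.100 can be checked numerically. In the sketch below, a polynomial is represented by its list of coefficients (an assumption made only for this illustration, with `p[k]` the coefficient of $x^k$), and the dual map is implemented literally as composition. The names `derivative`, `evaluate`, `integral_0_1`, and `dual_map` are hypothetical.

```python
from fractions import Fraction


def derivative(p):
    """Coefficients of p', where p[k] is the coefficient of x^k."""
    return [k * c for k, c in enumerate(p)][1:]


def evaluate(p, x):
    """Value of the polynomial p at x."""
    return sum(c * x**k for k, c in enumerate(p))


def integral_0_1(p):
    """Integral of p over [0, 1], computed term by term."""
    return sum(Fraction(c, k + 1) for k, c in enumerate(p))


def dual_map(T):
    """Return T', which sends a functional phi to the composition phi o T."""
    return lambda phi: (lambda p: phi(T(p)))


D_prime = dual_map(derivative)

p = [1, -4, 0, 2]                     # p(x) = 1 - 4x + 2x^3
eval_at_3 = lambda q: evaluate(q, 3)  # the functional q -> q(3)

print(D_prime(eval_at_3)(p))          # p'(3) = -4 + 6 * 9 = 50
print(D_prime(integral_0_1)(p))       # p(1) - p(0) = -1 - 1 = -2
```

The second value agrees with `evaluate(p, 1) - evaluate(p, 0)`, as the second bullet point of the example predicts.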
The first two bullet points in the result below imply that the function that
takes T to $T'$ is a linear map from $\mathcal{L}(V, W)$ to $\mathcal{L}(W', V')$.

In the third bullet point below, note the reversal of order from ST on the
left to $T'S'$ on the right (here we assume that U is a vector space over F).


3.101 Algebraic properties of dual maps
• $(S + T)' = S' + T'$ for all $S, T \in \mathcal{L}(V, W)$.
• $(\lambda T)' = \lambda T'$ for all $\lambda \in \mathbf{F}$ and all $T \in \mathcal{L}(V, W)$.
• $(ST)' = T'S'$ for all $T \in \mathcal{L}(U, V)$ and all $S \in \mathcal{L}(V, W)$.


Proof The proofs of the first two bullet points above are left to the reader.
To prove the third bullet point, suppose $\varphi \in W'$. Then

\[ (ST)'(\varphi) = \varphi \circ (ST) = (\varphi \circ S) \circ T = T'(\varphi \circ S) = T'(S'(\varphi)) = (T'S')(\varphi), \]

where the first, third, and fourth equalities above hold because of the
definition of the dual map, the second equality holds because composition of
functions is associative, and the last equality follows from the definition of
composition.

The equality of the first and last terms above for all $\varphi \in W'$ means that
$(ST)' = T'S'$. ∎

[Some books use the notation $V^*$ and $T^*$ for duality instead of $V'$ and $T'$.
However, here we reserve the notation $T^*$ for the adjoint, which will be
introduced when we study linear maps on inner product spaces.]
The Null Space and Range of the Dual of a Linear Map 


Our goal in this subsection is to describe $\operatorname{null} T'$ and $\operatorname{range} T'$ in terms of
$\operatorname{range} T$ and $\operatorname{null} T$. To do this, we will need the following definition.


3.102 Definition annihilator, $U^0$


For $U \subset V$, the annihilator of U, denoted $U^0$, is defined by
\[ U^0 = \{\varphi \in V' : \varphi(u) = 0 \text{ for all } u \in U\}. \]


3.103 Example Suppose U is the subspace of $\mathcal{P}(\mathbf{R})$ consisting of all
polynomial multiples of $x^2$. If $\varphi$ is the linear functional on $\mathcal{P}(\mathbf{R})$ defined by
$\varphi(p) = p'(0)$, then $\varphi \in U^0$.




For $U \subset V$, the annihilator $U^0$ is a subset of the dual space $V'$. Thus $U^0$
depends on the vector space containing U, so a notation such as $U^0_V$ would be
more precise. However, the containing vector space will always be clear from
the context, so we will use the simpler notation $U^0$.


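3.104 Example Let $e_1, e_2, e_3, e_4, e_5$ denote the standard basis of $\mathbf{R}^5$, and
let $\varphi_1, \varphi_2, \varphi_3, \varphi_4, \varphi_5$ denote the dual basis of $(\mathbf{R}^5)'$. Suppose

\[ U = \operatorname{span}(e_1, e_2) = \{(x_1, x_2, 0, 0, 0) \in \mathbf{R}^5 : x_1, x_2 \in \mathbf{R}\}. \]

Show that $U^0 = \operatorname{span}(\varphi_3, \varphi_4, \varphi_5)$.


Solution Recall (see 3.97) that $\varphi_j$ is the linear functional on $\mathbf{R}^5$ that selects
the $j$th coordinate: $\varphi_j(x_1, x_2, x_3, x_4, x_5) = x_j$.

First suppose $\varphi \in \operatorname{span}(\varphi_3, \varphi_4, \varphi_5)$. Then there exist $c_3, c_4, c_5 \in \mathbf{R}$ such
that $\varphi = c_3\varphi_3 + c_4\varphi_4 + c_5\varphi_5$. If $(x_1, x_2, 0, 0, 0) \in U$, then

\[ \varphi(x_1, x_2, 0, 0, 0) = (c_3\varphi_3 + c_4\varphi_4 + c_5\varphi_5)(x_1, x_2, 0, 0, 0) = 0. \]

Thus $\varphi \in U^0$. In other words, we have shown that $\operatorname{span}(\varphi_3, \varphi_4, \varphi_5) \subset U^0$.

To show the inclusion in the other direction, suppose $\varphi \in U^0$. Because
the dual basis is a basis of $(\mathbf{R}^5)'$, there exist $c_1, c_2, c_3, c_4, c_5 \in \mathbf{R}$ such that
$\varphi = c_1\varphi_1 + c_2\varphi_2 + c_3\varphi_3 + c_4\varphi_4 + c_5\varphi_5$. Because $e_1 \in U$ and $\varphi \in U^0$, we
have

\[ 0 = \varphi(e_1) = (c_1\varphi_1 + c_2\varphi_2 + c_3\varphi_3 + c_4\varphi_4 + c_5\varphi_5)(e_1) = c_1. \]

Similarly, $e_2 \in U$ and thus $c_2 = 0$. Hence $\varphi = c_3\varphi_3 + c_4\varphi_4 + c_5\varphi_5$. Thus
$\varphi \in \operatorname{span}(\varphi_3, \varphi_4, \varphi_5)$, which shows that $U^0 \subset \operatorname{span}(\varphi_3, \varphi_4, \varphi_5)$.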


3.105 The annihilator is a subspace


Suppose $U \subset V$. Then $U^0$ is a subspace of $V'$.


Proof Clearly $0 \in U^0$ (here 0 is the zero linear functional on V), because
the zero linear functional applied to every vector in U is 0.

Suppose $\varphi, \psi \in U^0$. Thus $\varphi, \psi \in V'$ and $\varphi(u) = \psi(u) = 0$ for every
$u \in U$. If $u \in U$, then $(\varphi + \psi)(u) = \varphi(u) + \psi(u) = 0 + 0 = 0$. Thus
$\varphi + \psi \in U^0$.

Similarly, $U^0$ is closed under scalar multiplication. Thus 1.34 implies that
$U^0$ is a subspace of $V'$. ∎

The next result shows that $\dim U^0$ is the difference of $\dim V$ and $\dim U$.
For example, this shows that if U is a 2-dimensional subspace of $\mathbf{R}^5$, then $U^0$
is a 3-dimensional subspace of $(\mathbf{R}^5)'$, as in Example 3.104.

The next result can be proved following the pattern of Example 3.104:
choose a basis $u_1, \dots, u_m$ of U, extend to a basis $u_1, \dots, u_m, \dots, u_n$ of V,
let $\varphi_1, \dots, \varphi_m, \dots, \varphi_n$ be the dual basis of $V'$, and then show $\varphi_{m+1}, \dots, \varphi_n$
is a basis of $U^0$, which implies the desired result.

You should construct the proof outlined in the paragraph above, even
though a slicker proof is presented here.


3.106 Dimension of the annihilator


Suppose V is finite-dimensional and U is a subspace of V. Then
\[ \dim U + \dim U^0 = \dim V. \]


Proof Let $i \in \mathcal{L}(U, V)$ be the inclusion map defined by $i(u) = u$ for $u \in U$.
Thus $i'$ is a linear map from $V'$ to $U'$. The Fundamental Theorem of Linear
Maps (3.22) applied to $i'$ shows that

\[ \dim \operatorname{range} i' + \dim \operatorname{null} i' = \dim V'. \]

However, $\operatorname{null} i' = U^0$ (as can be seen by thinking about the definitions) and
$\dim V' = \dim V$ (by 3.95), so we can rewrite the equation above as

\[ \dim \operatorname{range} i' + \dim U^0 = \dim V. \]

If $\varphi \in U'$, then $\varphi$ can be extended to a linear functional $\psi$ on V (see,
for example, Exercise 11 in Section 3.A). The definition of $i'$ shows that
$i'(\psi) = \varphi$. Thus $\varphi \in \operatorname{range} i'$, which implies that $\operatorname{range} i' = U'$. Hence
$\dim \operatorname{range} i' = \dim U' = \dim U$, and the displayed equation above becomes
the desired result. ∎


The proof of part (a) of the result below does not use the hypothesis that 
V and W are finite-dimensional. 


3.107 The null space of $T'$
Suppose V and W are finite-dimensional and $T \in \mathcal{L}(V, W)$. Then
(a) $\operatorname{null} T' = (\operatorname{range} T)^0$;
(b) $\dim \operatorname{null} T' = \dim \operatorname{null} T + \dim W - \dim V$.


Proof

(a) First suppose $\varphi \in \operatorname{null} T'$. Thus $0 = T'(\varphi) = \varphi \circ T$. Hence
\[ 0 = (\varphi \circ T)(v) = \varphi(Tv) \quad\text{for every } v \in V. \]

Thus $\varphi \in (\operatorname{range} T)^0$. This implies that $\operatorname{null} T' \subset (\operatorname{range} T)^0$.

To prove the inclusion in the opposite direction, now suppose that
$\varphi \in (\operatorname{range} T)^0$. Thus $\varphi(Tv) = 0$ for every vector $v \in V$. Hence
$0 = \varphi \circ T = T'(\varphi)$. In other words, $\varphi \in \operatorname{null} T'$, which shows that
$(\operatorname{range} T)^0 \subset \operatorname{null} T'$, completing the proof of (a).

(b) We have

\begin{align*}
\dim \operatorname{null} T' &= \dim (\operatorname{range} T)^0 \\
&= \dim W - \dim \operatorname{range} T \\
&= \dim W - (\dim V - \dim \operatorname{null} T) \\
&= \dim \operatorname{null} T + \dim W - \dim V,
\end{align*}

where the first equality comes from (a), the second equality comes from
3.106, and the third equality comes from the Fundamental Theorem of
Linear Maps (3.22). ∎


The next result can be useful because sometimes it is easier to verify that
$T'$ is injective than to show directly that T is surjective.


3.108 T surjective is equivalent to $T'$ injective


Suppose V and W are finite-dimensional and $T \in \mathcal{L}(V, W)$. Then T is
surjective if and only if $T'$ is injective.


Proof The map $T \in \mathcal{L}(V, W)$ is surjective if and only if $\operatorname{range} T = W$,
which happens if and only if $(\operatorname{range} T)^0 = \{0\}$, which happens if and only if
$\operatorname{null} T' = \{0\}$ [by 3.107(a)], which happens if and only if $T'$ is injective. ∎


3.109 The range of $T'$
Suppose V and W are finite-dimensional and $T \in \mathcal{L}(V, W)$. Then

(a) $\dim \operatorname{range} T' = \dim \operatorname{range} T$;
(b) $\operatorname{range} T' = (\operatorname{null} T)^0$.


Proof
(a) We have

\begin{align*}
\dim \operatorname{range} T' &= \dim W' - \dim \operatorname{null} T' \\
&= \dim W - \dim (\operatorname{range} T)^0 \\
&= \dim \operatorname{range} T,
\end{align*}

where the first equality comes from the Fundamental Theorem of Linear
Maps (3.22), the second equality comes from 3.95 and 3.107(a), and
the third equality comes from 3.106.


(b) First suppose $\varphi \in \operatorname{range} T'$. Thus there exists $\psi \in W'$ such that
$\varphi = T'(\psi)$. If $v \in \operatorname{null} T$, then

\[ \varphi(v) = (T'(\psi))v = (\psi \circ T)(v) = \psi(Tv) = \psi(0) = 0. \]

Hence $\varphi \in (\operatorname{null} T)^0$. This implies that $\operatorname{range} T' \subset (\operatorname{null} T)^0$.

We will complete the proof by showing that $\operatorname{range} T'$ and $(\operatorname{null} T)^0$
have the same dimension. To do this, note that

\begin{align*}
\dim \operatorname{range} T' &= \dim \operatorname{range} T \\
&= \dim V - \dim \operatorname{null} T \\
&= \dim (\operatorname{null} T)^0,
\end{align*}

where the first equality comes from (a), the second equality comes from
the Fundamental Theorem of Linear Maps (3.22), and the third equality
comes from 3.106. ∎


The next result should be compared to 3.108. 


3.110 T injective is equivalent to $T'$ surjective


Suppose V and W are finite-dimensional and $T \in \mathcal{L}(V, W)$. Then T is
injective if and only if $T'$ is surjective.


Proof The map $T \in \mathcal{L}(V, W)$ is injective if and only if $\operatorname{null} T = \{0\}$,
which happens if and only if $(\operatorname{null} T)^0 = V'$, which happens if and only if
$\operatorname{range} T' = V'$ [by 3.109(b)], which happens if and only if $T'$ is surjective. ∎

The Matrix of the Dual of a Linear Map


We now define the transpose of a matrix. 


3.111 Definition transpose, $A^t$


The transpose of a matrix A, denoted $A^t$, is the matrix obtained from
A by interchanging the rows and columns. More specifically, if A is an
m-by-n matrix, then $A^t$ is the n-by-m matrix whose entries are given by
the equation

\[ (A^t)_{k,j} = A_{j,k}. \]

3.112 Example If $A = \begin{pmatrix} 5 & -7 \\ 3 & 8 \\ -4 & 2 \end{pmatrix}$, then $A^t = \begin{pmatrix} 5 & 3 & -4 \\ -7 & 8 & 2 \end{pmatrix}$.


Note that here A is a 3-by-2 matrix and $A^t$ is a 2-by-3 matrix.


The transpose has nice algebraic properties: $(A + C)^t = A^t + C^t$ and
$(\lambda A)^t = \lambda A^t$ for all m-by-n matrices A, C and all $\lambda \in \mathbf{F}$ (see Exercise 33).

The next result shows that the transpose of the product of two matrices is
the product of the transposes in the opposite order.


3.113 The transpose of the product of matrices


If A is an m-by-n matrix and C is an n-by-p matrix, then

\[ (AC)^t = C^t A^t. \]


Proof Suppose $1 \le k \le p$ and $1 \le j \le m$. Then

\begin{align*}
\bigl((AC)^t\bigr)_{k,j} &= (AC)_{j,k} \\
&= \sum_{r=1}^{n} A_{j,r} C_{r,k} \\
&= \sum_{r=1}^{n} (C^t)_{k,r} (A^t)_{r,j} \\
&= (C^t A^t)_{k,j}.
\end{align*}

Thus $(AC)^t = C^t A^t$, as desired. ∎
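
A quick numerical check of 3.113 is sketched below with matrices stored as lists of rows. The helper names `transpose` and `matmul` and the particular entries of A and C are made up for this illustration; any m-by-n and n-by-p matrices would serve.

```python
def transpose(A):
    """Interchange the rows and columns of a matrix given as a list of rows."""
    return [list(col) for col in zip(*A)]


def matmul(A, C):
    """Product of an m-by-n matrix A and an n-by-p matrix C."""
    return [[sum(a * c for a, c in zip(row, col)) for col in zip(*C)] for row in A]


A = [[5, -7], [3, 8], [-4, 2]]      # a 3-by-2 matrix
C = [[1, 0, 2, -1], [3, 1, 0, 4]]   # an arbitrary 2-by-4 matrix

lhs = transpose(matmul(A, C))              # (AC)^t, a 4-by-3 matrix
rhs = matmul(transpose(C), transpose(A))   # C^t A^t, also 4-by-3
print(lhs == rhs)
```

Note that the product $A^t C^t$ in the original order is not even defined here, because $A^t$ is 2-by-3 and $C^t$ is 4-by-2.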
The setting for the next result is the assumption that we have a basis
$v_1, \dots, v_n$ of V, along with its dual basis $\varphi_1, \dots, \varphi_n$ of $V'$. We also have a
basis $w_1, \dots, w_m$ of W, along with its dual basis $\psi_1, \dots, \psi_m$ of $W'$. Thus
$\mathcal{M}(T)$ is computed with respect to the bases just mentioned of V and W,
and $\mathcal{M}(T')$ is computed with respect to the dual bases just mentioned of $W'$
and $V'$.


3.114 The matrix of $T'$ is the transpose of the matrix of T
Suppose $T \in \mathcal{L}(V, W)$. Then $\mathcal{M}(T') = (\mathcal{M}(T))^t$.


Proof Let $A = \mathcal{M}(T)$ and $C = \mathcal{M}(T')$. Suppose $1 \le j \le m$ and
$1 \le k \le n$.
From the definition of $\mathcal{M}(T')$ we have

\[ T'(\psi_j) = \sum_{r=1}^{n} C_{r,j}\,\varphi_r. \]

The left side of the equation above equals $\psi_j \circ T$. Thus applying both sides
of the equation above to $v_k$ gives

\begin{align*}
(\psi_j \circ T)(v_k) &= \sum_{r=1}^{n} C_{r,j}\,\varphi_r(v_k) \\
&= C_{k,j}.
\end{align*}

We also have

\begin{align*}
(\psi_j \circ T)(v_k) &= \psi_j(Tv_k) \\
&= \psi_j\Bigl(\sum_{r=1}^{m} A_{r,k}\,w_r\Bigr) \\
&= \sum_{r=1}^{m} A_{r,k}\,\psi_j(w_r) \\
&= A_{j,k}.
\end{align*}

Comparing the last line of the last two sets of equations, we have $C_{k,j} = A_{j,k}$.
Thus $C = A^t$. In other words, $\mathcal{M}(T') = (\mathcal{M}(T))^t$, as desired. ∎
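
The proof of 3.114 can be mirrored computationally in the special case of standard bases. In the sketch below (an illustration, with hypothetical names `apply_matrix` and `matrix_of_dual`), T is multiplication by a matrix A, each $\psi_j$ selects the $j$th coordinate, and the entry $C_{k,j}$ is obtained exactly as in the proof, by applying $\psi_j \circ T$ to the $k$th standard basis vector.

```python
def apply_matrix(A, x):
    """Compute Ax for a matrix A (list of rows) and a vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def matrix_of_dual(A):
    """Matrix of T' with respect to the dual bases of the standard bases,
    where T is multiplication by the m-by-n matrix A."""
    m, n = len(A), len(A[0])
    T = lambda x: apply_matrix(A, x)
    # psi[j] is the functional on F^m that selects coordinate j.
    psi = [lambda w, j=j: w[j] for j in range(m)]
    # e[k] is the k-th standard basis vector of F^n.
    e = [[int(i == k) for i in range(n)] for k in range(n)]
    # C[k][j] = (psi_j o T)(e_k), the coefficient of phi_k in T'(psi_j).
    return [[psi[j](T(e[k])) for j in range(m)] for k in range(n)]


A = [[4, 5, 6], [7, 8, 9]]          # a 2-by-3 matrix, so T maps F^3 to F^2
C = matrix_of_dual(A)               # a 3-by-2 matrix
print(C)
print(C == [list(col) for col in zip(*A)])  # compare with the transpose of A
```

Nothing in `matrix_of_dual` transposes anything explicitly; the transpose appears only because the roles of the two indices are exchanged when functionals are composed with T.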
The Rank of a Matrix


We begin by defining two nonnegative integers that are associated with each
matrix.


3.115 Definition row rank, column rank


Suppose A is an m-by-n matrix with entries in F.


• The row rank of A is the dimension of the span of the rows of A in
$\mathbf{F}^{1,n}$.

• The column rank of A is the dimension of the span of the columns
of A in $\mathbf{F}^{m,1}$.


3.116 Example Suppose $A = \begin{pmatrix} 4 & 7 & 1 & 8 \\ 3 & 5 & 2 & 9 \end{pmatrix}$. Find the row rank of A
and the column rank of A.


Solution The row rank of A is the dimension of

\[ \operatorname{span}\Bigl(\begin{pmatrix} 4 & 7 & 1 & 8 \end{pmatrix}, \begin{pmatrix} 3 & 5 & 2 & 9 \end{pmatrix}\Bigr) \]

in $\mathbf{F}^{1,4}$. Neither of the two vectors listed above in $\mathbf{F}^{1,4}$ is a scalar multiple
of the other. Thus the span of this list of length 2 has dimension 2. In other
words, the row rank of A is 2.

The column rank of A is the dimension of

\[ \operatorname{span}\Bigl(\begin{pmatrix} 4 \\ 3 \end{pmatrix}, \begin{pmatrix} 7 \\ 5 \end{pmatrix}, \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \begin{pmatrix} 8 \\ 9 \end{pmatrix}\Bigr) \]

in $\mathbf{F}^{2,1}$. Neither of the first two vectors listed above in $\mathbf{F}^{2,1}$ is a scalar multiple
of the other. Thus the span of this list of length 4 has dimension at least 2.
The span of this list of vectors in $\mathbf{F}^{2,1}$ cannot have dimension larger than 2
because $\dim \mathbf{F}^{2,1} = 2$. Thus the span of this list has dimension 2. In other
words, the column rank of A is 2.
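
The two ranks in Example 3.116 can also be computed mechanically. The sketch below uses Gaussian elimination over the rationals, a technique that is not developed in the text; it is used here only because row operations do not change the span of the rows, so the number of pivots equals the dimension of that span. The function names `rank` and `transpose` are assumptions of this illustration.

```python
from fractions import Fraction


def rank(A):
    """Dimension of the span of the rows of A, found by Gaussian elimination."""
    M = [[Fraction(x) for x in row] for row in A]
    r = 0  # number of pivots found so far
    for col in range(len(M[0]) if M else 0):
        # Look for a row at or below position r with a nonzero entry in this column.
        pivot = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        # Clear this column in the rows below the pivot row.
        for i in range(r + 1, len(M)):
            factor = M[i][col] / M[r][col]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
    return r


def transpose(A):
    """Interchange rows and columns, so rows of the result are columns of A."""
    return [list(col) for col in zip(*A)]


A = [[4, 7, 1, 8], [3, 5, 2, 9]]
row_rank = rank(A)
column_rank = rank(transpose(A))
print(row_rank, column_rank)
```

Computing the column rank as the row rank of the transpose is just a convenience of this sketch; it restates the definition, since the columns of A are the rows of its transpose.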


Notice that no bases are in sight in the statement of the next result. Although
$\mathcal{M}(T)$ in the next result depends on a choice of bases of V and W,
the next result shows that the column rank of $\mathcal{M}(T)$ is the same for all such
choices (because $\operatorname{range} T$ does not depend on a choice of basis).

3.117 Dimension of range T equals column rank of $\mathcal{M}(T)$


Suppose V and W are finite-dimensional and $T \in \mathcal{L}(V, W)$. Then
$\dim \operatorname{range} T$ equals the column rank of $\mathcal{M}(T)$.


Proof Suppose $v_1, \dots, v_n$ is a basis of V and $w_1, \dots, w_m$ is a basis of W. The
function that takes $w \in \operatorname{span}(Tv_1, \dots, Tv_n)$ to $\mathcal{M}(w)$ is easily seen to be an
isomorphism from $\operatorname{span}(Tv_1, \dots, Tv_n)$ onto $\operatorname{span}(\mathcal{M}(Tv_1), \dots, \mathcal{M}(Tv_n))$.
Thus $\dim \operatorname{span}(Tv_1, \dots, Tv_n) = \dim \operatorname{span}(\mathcal{M}(Tv_1), \dots, \mathcal{M}(Tv_n))$, where
the last dimension equals the column rank of $\mathcal{M}(T)$.


It is easy to see that $\operatorname{range} T = \operatorname{span}(Tv_1, \dots, Tv_n)$. Thus we have
$\dim \operatorname{range} T = \dim \operatorname{span}(Tv_1, \dots, Tv_n)$ = the column rank of $\mathcal{M}(T)$, as
desired. ∎

In Example 3.116, the row rank and column rank turned out to equal each
other. The next result shows that this always happens.


3.118 Row rank equals column rank


Suppose $A \in \mathbf{F}^{m,n}$. Then the row rank of A equals the column rank of A.


Proof Define $T\colon \mathbf{F}^{n,1} \to \mathbf{F}^{m,1}$ by $Tx = Ax$. Thus $\mathcal{M}(T) = A$, where
$\mathcal{M}(T)$ is computed with respect to the standard bases of $\mathbf{F}^{n,1}$ and $\mathbf{F}^{m,1}$. Now

\begin{align*}
\text{column rank of } A &= \text{column rank of } \mathcal{M}(T) \\
&= \dim \operatorname{range} T \\
&= \dim \operatorname{range} T' \\
&= \text{column rank of } \mathcal{M}(T') \\
&= \text{column rank of } A^t \\
&= \text{row rank of } A,
\end{align*}

where the second equality above comes from 3.117, the third equality comes
from 3.109(a), the fourth equality comes from 3.117 (where $\mathcal{M}(T')$ is computed
with respect to the dual bases of the standard bases), the fifth equality
comes from 3.114, and the last equality follows easily from the definitions. ∎


The last result allows us to dispense with the terms “row rank” and “column
rank” and just use the simpler term “rank”.


3.119 Definition rank


The rank of a matrix $A \in \mathbf{F}^{m,n}$ is the column rank of A.

EXERCISES 3.F


1 Explain why every linear functional is either surjective or the zero map.


2 Give three distinct examples of linear functionals on $\mathbf{R}^{[0,1]}$.


3 Suppose V is finite-dimensional and $v \in V$ with $v \neq 0$. Prove that there
exists $\varphi \in V'$ such that $\varphi(v) = 1$.


4 Suppose V is finite-dimensional and U is a subspace of V such that
$U \neq V$. Prove that there exists $\varphi \in V'$ such that $\varphi(u) = 0$ for every
$u \in U$ but $\varphi \neq 0$.


5 Suppose $V_1, \dots, V_m$ are vector spaces. Prove that $(V_1 \times \cdots \times V_m)'$ and
$V_1' \times \cdots \times V_m'$ are isomorphic vector spaces.


6 Suppose V is finite-dimensional and $v_1, \dots, v_m \in V$. Define a linear
map $\Gamma\colon V' \to \mathbf{F}^m$ by

\[ \Gamma(\varphi) = (\varphi(v_1), \dots, \varphi(v_m)). \]

(a) Prove that $v_1, \dots, v_m$ spans V if and only if $\Gamma$ is injective.
(b) Prove that $v_1, \dots, v_m$ is linearly independent if and only if $\Gamma$ is
surjective.


7 Suppose m is a positive integer. Show that the dual basis of the basis
$1, x, \dots, x^m$ of $\mathcal{P}_m(\mathbf{R})$ is $\varphi_0, \varphi_1, \dots, \varphi_m$, where $\varphi_j(p) = \frac{p^{(j)}(0)}{j!}$. Here
$p^{(j)}$ denotes the $j$th derivative of p, with the understanding that the 0th
derivative of p is p.


8 Suppose m is a positive integer.

(a) Show that $1, x - 5, \dots, (x - 5)^m$ is a basis of $\mathcal{P}_m(\mathbf{R})$.
(b) What is the dual basis of the basis in part (a)?


9 Suppose $v_1, \dots, v_n$ is a basis of V and $\varphi_1, \dots, \varphi_n$ is the corresponding
dual basis of $V'$. Suppose $\psi \in V'$. Prove that

\[ \psi = \psi(v_1)\varphi_1 + \cdots + \psi(v_n)\varphi_n. \]


10 Prove the first two bullet points in 3.101.


11 Suppose A is an m-by-n matrix with $A \neq 0$. Prove that the rank of A
is 1 if and only if there exist $(c_1, \dots, c_m) \in \mathbf{F}^m$ and $(d_1, \dots, d_n) \in \mathbf{F}^n$
such that $A_{j,k} = c_j d_k$ for every $j = 1, \dots, m$ and every $k = 1, \dots, n$.


12 Show that the dual map of the identity map on V is the identity map
on $V'$.


13 Define $T\colon \mathbf{R}^3 \to \mathbf{R}^2$ by $T(x, y, z) = (4x + 5y + 6z, 7x + 8y + 9z)$.
Suppose $\varphi_1, \varphi_2$ denotes the dual basis of the standard basis of $\mathbf{R}^2$ and
$\psi_1, \psi_2, \psi_3$ denotes the dual basis of the standard basis of $\mathbf{R}^3$.

(a) Describe the linear functionals $T'(\varphi_1)$ and $T'(\varphi_2)$.
(b) Write $T'(\varphi_1)$ and $T'(\varphi_2)$ as linear combinations of $\psi_1, \psi_2, \psi_3$.


14 Define $T\colon \mathcal{P}(\mathbf{R}) \to \mathcal{P}(\mathbf{R})$ by $(Tp)(x) = x^2 p(x) + p''(x)$ for $x \in \mathbf{R}$.

(a) Suppose $\varphi \in \mathcal{P}(\mathbf{R})'$ is defined by $\varphi(p) = p'(4)$. Describe the
linear functional $T'(\varphi)$ on $\mathcal{P}(\mathbf{R})$.

(b) Suppose $\varphi \in \mathcal{P}(\mathbf{R})'$ is defined by $\varphi(p) = \int_0^1 p(x)\,dx$. Evaluate
$(T'(\varphi))(x)$.


15 Suppose W is finite-dimensional and $T \in \mathcal{L}(V, W)$. Prove that $T' = 0$
if and only if $T = 0$.


16 Suppose V and W are finite-dimensional. Prove that the map that takes
$T \in \mathcal{L}(V, W)$ to $T' \in \mathcal{L}(W', V')$ is an isomorphism of $\mathcal{L}(V, W)$ onto
$\mathcal{L}(W', V')$.


17 Suppose $U \subset V$. Explain why $U^0 = \{\varphi \in V' : U \subset \operatorname{null} \varphi\}$.


18 Suppose V is finite-dimensional and $U \subset V$. Show that $U = \{0\}$ if and
only if $U^0 = V'$.


19 Suppose V is finite-dimensional and U is a subspace of V. Show that
$U = V$ if and only if $U^0 = \{0\}$.


20 Suppose U and W are subsets of V with $U \subset W$. Prove that $W^0 \subset U^0$.


Suppose V is finite-dimensional and U and W are subspaces of V with 
W°? c U?. Prove that U C W. 


Suppose U, W are subspaces of V. Show that (U + W)? = U? n W°. 
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Suppose V is finite-dimensional and U and W are subspaces of V. Prove 
that (UN W)? = U? + W°. 


Prove 3.106 using the ideas sketched in the discussion before the state- 
ment of 3.106. 


Suppose V is finite-dimensional and U is a subspace of V. Show that 
U = {v € V : ov) = 0 for every g € US. 
Suppose V is finite-dimensional and T is a subspace of V’. Show that 


T = {v € V : ov) = 0 for every g € T}. 


Suppose T € L(Ps(R), Ps(R)) and null T” = span(g), where ¢ is 
the linear functional on P5(R) defined by g(p) = p(8). Prove that 
range T = {p € Ps(R) : p(8) = 0}. 


Suppose V and W are finite-dimensional, T € L(V, W), and there exists 
g € W” such that null T” = span(g). Prove that range T = null g. 


Suppose V and W are finite-dimensional, T € L(V, W), and there exists 
gy € V’ such that range T’ = span(g). Prove that null T = null ọ. 


Suppose V is finite-dimensional and ¢1,..., @m 1s a linearly independent 
list in V’. Prove that 


dim ((null gı) A- N (null Pm)) = (dim V) — m. 


Suppose V is finite-dimensional and g1, ..., n is a basis of V’. Show 
that there exists a basis of V whose dual basis is 91,..., Øn. 
Suppose T € L(V), and u1,..., un and v1, ...,Vn are bases of V. Prove 


that the following are equivalent: 


(a) T is invertible. 

(b) The columns of M (T) are linearly independent in F”-1. 
(c) The columns of M (T) span F”!. 

(d) The rows of M (T) are linearly independent in F1”. 

(e) The rows of M (T) span F1”. 


Here M(T) means M (T, (u1, ..., Un), (V1,-..,Un)). 
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Suppose m and n are positive integers. Prove that the function that takes 
Ato A'is a linear map from F”*” to F”:”, Furthermore, prove that this 
linear map is invertible. 


The double dual space of V, denoted V”, is defined to be the dual space 
of V’. In other words, V” = (V’)’. Define A: V > V” by 


(Av)(g) = p0) 
forve V andọ € V”. 


(a) Show that A is a linear map from V to V”. 

(b) Show thatif T € L(V), then T” o A = A o T, where T” = (T'Y. 

(c) Show that if V is finite-dimensional, then A is an isomorphism 
from V onto V”. 


[Suppose V is finite-dimensional. Then V and V' are isomorphic, but 
finding an isomorphism from V onto V’ generally requires choosing a 
basis of V. In contrast, the isomorphism A from V onto V" does not 
require a choice of basis and thus is considered more natural. | 


Show that (PR) and R° are isomorphic. 


Suppose U is a subspace of V. Let i: U → V be the inclusion map
defined by i(u) = u. Thus i′ ∈ L(V′, U′).

(a) Show that null i′ = U⁰.

(b) Prove that if V is finite-dimensional, then range i′ = U′.

(c) Prove that if V is finite-dimensional, then ĩ′ is an isomorphism
from V′/U⁰ onto U′.

[The isomorphism in part (c) is natural in that it does not depend on a
choice of basis in either vector space.]


Suppose U is a subspace of V. Let π: V → V/U be the usual quotient
map. Thus π′ ∈ L((V/U)′, V′).

(a) Show that π′ is injective.

(b) Show that range π′ = U⁰.

(c) Conclude that π′ is an isomorphism from (V/U)′ onto U⁰.

[The isomorphism in part (c) is natural in that it does not depend on a
choice of basis in either vector space. In fact, there is no assumption
here that any of these vector spaces are finite-dimensional.]

Polynomials


This short chapter contains material on polynomials that we will need to 
understand operators. Many of the results in this chapter will already be 
familiar to you from other courses; they are included here for completeness. 

Because this chapter is not about linear algebra, your instructor may go 
through it rapidly. You may not be asked to scrutinize all the proofs. Make 
sure, however, that you at least read and understand the statements of all the 
results in this chapter—they will be used in later chapters. 

The standing assumption we need for this chapter is as follows: 


4.1 Notation F
F denotes R or C.


LEARNING OBJECTIVES FOR THIS CHAPTER
■ Division Algorithm for Polynomials
■ factorization of polynomials over C
■ factorization of polynomials over R

Complex Conjugate and Absolute Value


Before discussing polynomials with complex or real coefficients, we need to 
learn a bit more about the complex numbers. 


4.2 Definition Re z, Im z

Suppose z = a + bi, where a and b are real numbers.

• The real part of z, denoted Re z, is defined by Re z = a.

• The imaginary part of z, denoted Im z, is defined by Im z = b.

Thus for every complex number z, we have

\[ z = \operatorname{Re} z + (\operatorname{Im} z)i. \]


4.3 Definition complex conjugate, z̄, absolute value, |z|

Suppose z ∈ C.

• The complex conjugate of z ∈ C, denoted z̄, is defined by

\[ \bar z = \operatorname{Re} z - (\operatorname{Im} z)i. \]

• The absolute value of a complex number z, denoted |z|, is defined by

\[ |z| = \sqrt{(\operatorname{Re} z)^2 + (\operatorname{Im} z)^2}. \]


4.4 Example Suppose z = 3 + 2i. Then
• Re z = 3 and Im z = 2;
• z̄ = 3 − 2i;
• |z| = √(3² + 2²) = √13.


Note that |z| is a nonnegative number for every z ∈ C.

[You should verify that z = z̄ if and only if z is a real number.]

The real and imaginary parts, complex conjugate, and absolute value have
the following properties:

4.5 Properties of complex numbers

Suppose w, z ∈ C. Then

sum of z and z̄
\[ z + \bar z = 2\operatorname{Re} z; \]

difference of z and z̄
\[ z - \bar z = 2(\operatorname{Im} z)i; \]

product of z and z̄
\[ z\bar z = |z|^2; \]

additivity and multiplicativity of complex conjugate
\[ \overline{w + z} = \bar w + \bar z \quad\text{and}\quad \overline{wz} = \bar w\,\bar z; \]

conjugate of conjugate
\[ \overline{\bar z} = z; \]

real and imaginary parts are bounded by |z|
\[ |\operatorname{Re} z| \le |z| \quad\text{and}\quad |\operatorname{Im} z| \le |z|; \]

absolute value of the complex conjugate
\[ |\bar z| = |z|; \]

multiplicativity of absolute value
\[ |wz| = |w|\,|z|; \]

Triangle Inequality
\[ |w + z| \le |w| + |z|. \]
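The properties in 4.5 can be spot-checked numerically. The following is only a sketch using Python's built-in complex type; the function name check_properties, the tolerance, and the sample values of w and z are assumptions chosen for illustration, and a numerical check is of course not a proof.

```python
def check_properties(w: complex, z: complex, tol: float = 1e-12) -> dict:
    """Map each property listed in 4.5 to True/False for the given w and z."""
    def close(a, b):
        # Floating-point comparison up to a small tolerance.
        return abs(a - b) <= tol

    return {
        "sum": close(z + z.conjugate(), 2 * z.real),
        "difference": close(z - z.conjugate(), 2 * z.imag * 1j),
        "product": close(z * z.conjugate(), abs(z) ** 2),
        "additivity": close((w + z).conjugate(), w.conjugate() + z.conjugate()),
        "multiplicativity": close((w * z).conjugate(), w.conjugate() * z.conjugate()),
        "conjugate of conjugate": close(z.conjugate().conjugate(), z),
        "bounded parts": abs(z.real) <= abs(z) and abs(z.imag) <= abs(z),
        "abs of conjugate": close(abs(z.conjugate()), abs(z)),
        "abs multiplicative": close(abs(w * z), abs(w) * abs(z)),
        "triangle inequality": abs(w + z) <= abs(w) + abs(z) + tol,
    }


# Hypothetical sample values; any pair of complex numbers could be used.
results = check_properties(3 + 2j, -1 + 4j)
```

Each entry of results should be True for any choice of w and z, up to rounding error.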

Proof Except for the last item, the routine verifications of the assertions
above are left to the reader. To verify the last item, we have

\[
\begin{aligned}
|w + z|^2 &= (w + z)(\bar w + \bar z) \\
&= w\bar w + z\bar z + w\bar z + z\bar w \\
&= |w|^2 + |z|^2 + w\bar z + \overline{w\bar z} \\
&= |w|^2 + |z|^2 + 2\operatorname{Re}(w\bar z) \\
&\le |w|^2 + |z|^2 + 2|w\bar z| \\
&= |w|^2 + |z|^2 + 2|w|\,|z| \\
&= (|w| + |z|)^2.
\end{aligned}
\]

Taking the square root of both sides of the inequality |w + z|² ≤ (|w| + |z|)²
now gives the desired inequality.

Uniqueness of Coefficients for Polynomials


Recall that a function p: F → F is called a polynomial with coefficients in F
if there exist a₀, …, aₘ ∈ F such that

4.6    \[ p(z) = a_0 + a_1 z + a_2 z^2 + \cdots + a_m z^m \]

for all z ∈ F.


4.7 If a polynomial is the zero function, then all coefficients are 0

Suppose a₀, …, aₘ ∈ F. If

\[ a_0 + a_1 z + \cdots + a_m z^m = 0 \]

for every z ∈ F, then a₀ = ⋯ = aₘ = 0.


Proof We will prove the contrapositive. If not all the coefficients are 0, then
by changing m we can assume aₘ ≠ 0. Let

\[ z = \frac{|a_0| + |a_1| + \cdots + |a_{m-1}|}{|a_m|} + 1. \]

Note that z ≥ 1, and thus zʲ ≤ zᵐ⁻¹ for j = 0, 1, …, m − 1. Using the
Triangle Inequality, we have

\[
\begin{aligned}
|a_0 + a_1 z + \cdots + a_{m-1} z^{m-1}| &\le (|a_0| + |a_1| + \cdots + |a_{m-1}|)\, z^{m-1} \\
&< |a_m z^m|.
\end{aligned}
\]

Thus a₀ + a₁z + ⋯ + aₘ₋₁zᵐ⁻¹ ≠ −aₘzᵐ. Hence we conclude that
a₀ + a₁z + ⋯ + aₘ₋₁zᵐ⁻¹ + aₘzᵐ ≠ 0. ■
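The choice of z in the proof above can be illustrated numerically. This is a sketch under stated assumptions: the helper names dominating_point and evaluate and the sample coefficients are hypothetical, and the code only checks one example rather than proving anything.

```python
def dominating_point(coeffs):
    """Given [a0, ..., am] with am != 0, return the z chosen in the proof of 4.7."""
    *lower, lead = coeffs
    return sum(abs(a) for a in lower) / abs(lead) + 1


def evaluate(coeffs, z):
    """Evaluate a0 + a1 z + ... + am z^m for coeffs = [a0, ..., am]."""
    return sum(a * z**j for j, a in enumerate(coeffs))


# Hypothetical example: p(z) = 5 + (-3 + 2i) z + 0 z^2 + 0.5 z^3.
coeffs = [5, -3 + 2j, 0, 0.5]
z = dominating_point(coeffs)

# The lower-order terms are strictly smaller in absolute value than the leading term at z.
lower_part = abs(evaluate(coeffs[:-1], z))
leading_part = abs(coeffs[-1] * z ** (len(coeffs) - 1))
```

Because lower_part is strictly less than leading_part, the two cannot cancel, so the polynomial is nonzero at this particular z.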


The result above implies that the coefficients of a polynomial are uniquely 
determined (because if a polynomial had two different sets of coefficients, 
then subtracting the two representations of the polynomial would give a 
contradiction to the result above). 

Recall that if a polynomial p can be written in the form 4.6 with aₘ ≠ 0,
then we say that p has degree m and we write deg p = m.

The degree of the 0 polynomial is defined to be −∞. When necessary, use
the obvious arithmetic with −∞. For example, −∞ < m and −∞ + m = −∞
for every integer m.

[The 0 polynomial is declared to have degree −∞ so that exceptions are not
needed for various reasonable results. For example, deg(pq) = deg p + deg q
even if p = 0.]
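The convention deg 0 = −∞ can be mirrored in code with floating-point negative infinity. The sketch below assumes polynomials are stored as coefficient lists [a0, a1, ...]; the names deg and multiply are illustrative and not part of the text.

```python
NEG_INF = float("-inf")


def deg(p):
    """Degree of a polynomial given as a coefficient list [a0, a1, ...]."""
    for j in range(len(p) - 1, -1, -1):
        if p[j] != 0:
            return j
    return NEG_INF  # the 0 polynomial


def multiply(p, q):
    """Coefficient list of the product pq."""
    if not p or not q:
        return []
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


p = [1, 2, 3]   # 1 + 2z + 3z^2
q = [0, 1]      # z
zero = [0]      # the 0 polynomial
```

With this convention, deg(multiply(p, q)) == deg(p) + deg(q) holds even when one factor is the 0 polynomial, because −∞ + m evaluates to −∞.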

The Division Algorithm for Polynomials


If p and s are nonnegative integers, with s ≠ 0, then there exist nonnegative
integers q and r such that

\[ p = sq + r \]

and r < s. Think of dividing p by s, getting quotient q with remainder r. Our
next task is to prove an analogous result for polynomials.

The result below is often called the Division Algorithm for Polynomials,
although as stated here it is not really an algorithm, just a useful result.

[Think of the Division Algorithm for Polynomials as giving the remainder r
when p is divided by s.]

Recall that P(F) denotes the vector space of all polynomials with co- 
efficients in F and that Pm(F) is the subspace of P(F) consisting of the 
polynomials with coefficients in F and degree at most m. 

The next result can be proved without linear algebra, but the proof given 
here using linear algebra is appropriate for a linear algebra textbook. 




4.8 Division Algorithm for Polynomials

Suppose that p, s ∈ P(F), with s ≠ 0. Then there exist unique
polynomials q, r ∈ P(F) such that

\[ p = sq + r \]

and deg r < deg s.


Proof Let n = deg p and m = deg s. If n < m, then take q = 0 and r = p
to get the desired result. Thus we can assume that n ≥ m.
Define T: Pₙ₋ₘ(F) × Pₘ₋₁(F) → Pₙ(F) by

\[ T(q, r) = sq + r. \]

The reader can easily verify that T is a linear map. If (q, r) ∈ null T, then
sq + r = 0, which implies that q = 0 and r = 0 [because otherwise
deg sq ≥ m and thus sq cannot equal −r]. Thus dim null T = 0 (proving the
“unique” part of the result).

From 3.76 we have

\[ \dim\bigl(\mathcal{P}_{n-m}(\mathbf{F}) \times \mathcal{P}_{m-1}(\mathbf{F})\bigr) = (n - m + 1) + (m - 1 + 1) = n + 1. \]

The Fundamental Theorem of Linear Maps (3.22) and the equation displayed
above now imply that dim range T = n + 1, which equals dim Pₙ(F). Thus
range T = Pₙ(F), and hence there exist q ∈ Pₙ₋ₘ(F) and r ∈ Pₘ₋₁(F)
such that p = T(q, r) = sq + r. ■
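The proof above shows that q and r exist without computing them. For comparison, here is a sketch of ordinary polynomial long division, which is a different technique from the dimension-counting argument in the proof. The function name poly_divmod and the coefficient-list representation (lowest degree first) are assumptions made for this illustration; exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction


def poly_divmod(p, s):
    """Return (q, r) with p = s*q + r and deg r < deg s.

    Polynomials are coefficient lists, lowest degree first; r == [] means r = 0.
    """
    s = [Fraction(a) for a in s]
    while s and s[-1] == 0:        # drop zero leading coefficients of s
        s.pop()
    if not s:
        raise ZeroDivisionError("s must be a nonzero polynomial")
    r = [Fraction(a) for a in p]
    q = [Fraction(0)] * max(len(r) - len(s) + 1, 1)
    while True:
        while r and r[-1] == 0:    # drop zero leading coefficients of r
            r.pop()
        if len(r) < len(s):        # now deg r < deg s, so we are done
            break
        shift = len(r) - len(s)
        factor = r[-1] / s[-1]
        q[shift] = factor
        for j, a in enumerate(s):  # subtract factor * z^shift * s from r
            r[shift + j] -= factor * a
    return q, r


# Example: z^3 - 2z + 5 divided by z - 1.
quotient, remainder = poly_divmod([5, -2, 0, 1], [-1, 1])
```

In this example the quotient is z² + z − 1 and the remainder is the constant 4.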

Zeros of Polynomials
The solutions to the equation p(z) = 0 play a crucial role in the study of a 
polynomial p € P(F). Thus these solutions have a special name. 

4.9 Definition zero of a polynomial

A number λ ∈ F is called a zero (or root) of a polynomial p ∈ P(F) if

\[ p(\lambda) = 0. \]


4.10 Definition factor 
A polynomial s € P(F) is called a factor of p € P(F) if there exists a 
polynomial q € P(F) such that p = sq. 


We begin by showing that λ is a zero of a polynomial p ∈ P(F) if and
only if z − λ is a factor of p.


4.11 Each zero of a polynomial corresponds to a degree-1 factor
Suppose p ∈ P(F) and λ ∈ F. Then p(λ) = 0 if and only if there is a
polynomial q ∈ P(F) such that

\[ p(z) = (z - \lambda)q(z) \]

for every z ∈ F.


Proof One direction is obvious. Namely, suppose there is a polynomial
q ∈ P(F) such that p(z) = (z − λ)q(z) for all z ∈ F. Then

\[ p(\lambda) = (\lambda - \lambda)q(\lambda) = 0, \]

as desired.

To prove the other direction, suppose p(λ) = 0. The polynomial z − λ
has degree 1. Because a polynomial with degree less than 1 is a constant
function, the Division Algorithm for Polynomials (4.8) implies that there exist
a polynomial q ∈ P(F) and a number r ∈ F such that

\[ p(z) = (z - \lambda)q(z) + r \]

for every z ∈ F. The equation above and the equation p(λ) = 0 imply that
r = 0. Thus p(z) = (z − λ)q(z) for every z ∈ F. ■
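Dividing by z − λ can be carried out with Horner's rule (synthetic division), and the remainder it produces is exactly p(λ), in line with the argument above. This is a sketch only: the name divide_by_linear and the coefficient-list convention (lowest degree first, nonempty list) are assumptions for illustration.

```python
def divide_by_linear(p, lam):
    """Divide p by (z - lam) using Horner's rule.

    p is a nonempty coefficient list, lowest degree first.
    Returns (q, r) with p(z) = (z - lam) q(z) + r, where r equals p(lam).
    """
    partial = []
    carry = 0
    for a in reversed(p):          # process coefficients from highest degree down
        carry = carry * lam + a
        partial.append(carry)
    r = partial.pop()              # the final Horner value is p(lam)
    partial.reverse()              # back to lowest-degree-first order
    return partial, r


# Example: p(z) = z^3 - 6z^2 + 11z - 6 has 2 as a zero.
q, r = divide_by_linear([-6, 11, -6, 1], 2)
```

Here r is 0 and q represents z² − 4z + 3, so p(z) = (z − 2)(z² − 4z + 3).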


Now we can prove that polynomials do not have too many zeros. 

4.12 A polynomial has at most as many zeros as its degree

Suppose p ∈ P(F) is a polynomial with degree m ≥ 0. Then p has at
most m distinct zeros in F.

Proof If m = 0, then p(z) = a₀ ≠ 0 and so p has no zeros.

If m = 1, then p(z) = a₀ + a₁z, with a₁ ≠ 0, and thus p has exactly
one zero, namely, −a₀/a₁.

Now suppose m > 1. We use induction on m, assuming that every
polynomial with degree m − 1 has at most m − 1 distinct zeros. If p has no
zeros in F, then we are done. If p has a zero λ ∈ F, then by 4.11 there is a
polynomial q such that

\[ p(z) = (z - \lambda)q(z) \]

for all z ∈ F. Clearly deg q = m − 1. The equation above shows that if
p(z) = 0, then either z = λ or q(z) = 0. In other words, the zeros of p
consist of λ and the zeros of q. By our induction hypothesis, q has at most
m − 1 distinct zeros in F. Thus p has at most m distinct zeros in F. ■


Factorization of Polynomials over C 


So far we have been handling polynomials with complex coefficients and 
polynomials with real coefficients simultaneously through our convention that 
F denotes R or C. Now we will see some differences between these two cases. 
First we treat polynomials with complex coefficients. Then we will use our 
results about polynomials with complex coefficients to prove corresponding 
results for polynomials with real coefficients. 

The next result, although called the Fundamental Theorem of Algebra, uses
analysis in its proof. The short proof presented here uses tools from complex
analysis. If you have not had a course in complex analysis, this proof will
almost certainly be meaningless to you. In that case, just accept the
Fundamental Theorem of Algebra as something that we need to use but whose
proof requires more advanced tools that you may learn in later courses.

[The Fundamental Theorem of Algebra is an existence theorem. Its proof does
not lead to a method for finding zeros. The quadratic formula gives the zeros
explicitly for polynomials of degree 2. Similar but more complicated formulas
exist for polynomials of degree 3 and 4. No such formulas exist for
polynomials of degree 5 and above.]

4.13 Fundamental Theorem of Algebra

Every nonconstant polynomial with complex coefficients has a zero.

Proof Let p be a nonconstant polynomial with complex coefficients. Suppose
p has no zeros. Then 1/p is an analytic function on C. Furthermore,
|p(z)| → ∞ as |z| → ∞, which implies that 1/p → 0 as |z| → ∞. Thus
1/p is a bounded analytic function on C. By Liouville’s theorem, every such
function is constant. But if 1/p is constant, then p is constant, contradicting
our assumption that p is nonconstant. ■


Although the proof given above is probably the shortest proof of the 
Fundamental Theorem of Algebra, a web search can lead you to several other 
proofs that use different techniques. All proofs of the Fundamental Theorem 
of Algebra need to use some analysis, because the result is not true if C is 
replaced, for example, with the set of numbers of the form c + di where c, d
are rational numbers.


[The cubic formula, which was discovered in the 16th century, is presented
below for your amusement only. Do not memorize it.

Suppose

\[ p(x) = ax^3 + bx^2 + cx + d, \]

where a ≠ 0. Set

\[ u = \frac{9abc - 2b^3 - 27a^2 d}{54a^3} \]

and then set

\[ v = u^2 + \left(\frac{3ac - b^2}{9a^2}\right)^{3}. \]

Suppose v ≥ 0. Then

\[ -\frac{b}{3a} + \sqrt[3]{u + \sqrt{v}} + \sqrt[3]{u - \sqrt{v}} \]

is a zero of p.]
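As a sanity check, the cubic formula can be evaluated in floating point for real coefficients. This is a sketch under the formula's own hypothesis v ≥ 0; the names real_cbrt and cubic_zero and the two sample cubics are assumptions for illustration.

```python
import math


def real_cbrt(x):
    """Real cube root, valid for negative x as well."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


def cubic_zero(a, b, c, d):
    """One real zero of a x^3 + b x^2 + c x + d via the cubic formula (needs v >= 0)."""
    u = (9 * a * b * c - 2 * b**3 - 27 * a**2 * d) / (54 * a**3)
    v = u**2 + ((3 * a * c - b**2) / (9 * a**2)) ** 3
    if v < 0:
        raise ValueError("the formula as stated requires v >= 0")
    root_v = math.sqrt(v)
    return -b / (3 * a) + real_cbrt(u + root_v) + real_cbrt(u - root_v)


# x^3 + x - 2 has the zero 1; x^3 - x^2 - x - 2 = (x - 2)(x^2 + x + 1) has the zero 2.
zero_a = cubic_zero(1, 0, 1, -2)
zero_b = cubic_zero(1, -1, -1, -2)
```

Both results agree with the known zeros up to rounding error.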


Remarkably, mathematicians have 
proved that no formula exists for the ze- 
ros of polynomials of degree 5 or higher. 
But computers and calculators can use 
clever numerical methods to find good 
approximations to the zeros of any poly- 
nomial, even when exact zeros cannot 
be found. 

For example, no one will ever be
able to give an exact formula for a zero
of the polynomial p defined by

\[ p(x) = x^5 - 5x^4 - 6x^3 + 17x^2 + 4x - 7. \]


However, a computer or symbolic cal- 
culator can find approximate zeros of 
this polynomial. 
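One simple numerical method is bisection, sketched below for the polynomial p above. Bisection is only one of many possible methods and is chosen here for brevity; the interval [0, 1] is an assumption justified by p(0) = −7 < 0 < 4 = p(1), and the function name bisect is illustrative.

```python
def p(x):
    return x**5 - 5 * x**4 - 6 * x**3 + 17 * x**2 + 4 * x - 7


def bisect(f, lo, hi, tol=1e-12):
    """Approximate a zero of a continuous f on [lo, hi].

    Assumes f(lo) and f(hi) have opposite signs.
    """
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("f(lo) and f(hi) must have opposite signs")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if f_lo * f_mid <= 0:      # a zero lies in [lo, mid]
            hi = mid
        else:                      # a zero lies in [mid, hi]
            lo, f_lo = mid, f_mid
    return (lo + hi) / 2


approx_zero = bisect(p, 0.0, 1.0)
```

The value approx_zero is only an approximation: p(approx_zero) is very close to 0 but need not equal 0 exactly.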

The Fundamental Theorem of Algebra leads to the following factorization
result for polynomials with complex coefficients. Note that in this
factorization, the numbers λ₁, …, λₘ are precisely the zeros of p, for these
are the only values of z for which the right side of the equation in the next
result equals 0.

4.14 Factorization of a polynomial over C

If p ∈ P(C) is a nonconstant polynomial, then p has a unique factorization
(except for the order of the factors) of the form

\[ p(z) = c(z - \lambda_1)\cdots(z - \lambda_m), \]

where c, λ₁, …, λₘ ∈ C.


Proof Let p ∈ P(C) and let m = deg p. We will use induction on m. If
m = 1, then clearly the desired factorization exists and is unique. So assume
that m > 1 and that the desired factorization exists and is unique for all
polynomials of degree m − 1.

First we will show that the desired factorization of p exists. By the
Fundamental Theorem of Algebra (4.13), p has a zero λ. By 4.11, there is a
polynomial q such that

\[ p(z) = (z - \lambda)q(z) \]

for all z ∈ C. Because deg q = m − 1, our induction hypothesis implies that
q has the desired factorization, which when plugged into the equation above
gives the desired factorization of p.

Now we turn to the question of uniqueness. Clearly c is uniquely determined
as the coefficient of zᵐ in p. So we need only show that except for the
order, there is only one way to choose λ₁, …, λₘ. If

\[ (z - \lambda_1)\cdots(z - \lambda_m) = (z - \tau_1)\cdots(z - \tau_m) \]

for all z ∈ C, then because the left side of the equation above equals 0 when
z = λ₁, one of the τ’s on the right side equals λ₁. Relabeling, we can assume
that τ₁ = λ₁. Now for z ≠ λ₁, we can divide both sides of the equation
above by z − λ₁, getting

\[ (z - \lambda_2)\cdots(z - \lambda_m) = (z - \tau_2)\cdots(z - \tau_m) \]

for all z ∈ C except possibly z = λ₁. Actually the equation above holds
for all z ∈ C, because otherwise by subtracting the right side from the left
side we would get a nonzero polynomial that has infinitely many zeros. The
equation above and our induction hypothesis imply that except for the order,
the λ’s are the same as the τ’s, completing the proof of uniqueness. ■

Factorization of Polynomials over R


The failure of the Fundamental 
Theorem of Algebra for R accounts 
for the differences between oper- 
ators on real and complex vector 
spaces, as we will see in later 
chapters. 


A polynomial with real coefficients may
have no real zeros. For example, the
polynomial 1 + x² has no real zeros.

To obtain a factorization theorem 
over R, we will use our factorization 
theorem over C. We begin with the fol- 
lowing result. 


4.15 Polynomials with real coefficients have zeros in pairs

Suppose p ∈ P(C) is a polynomial with real coefficients. If λ ∈ C is a
zero of p, then so is λ̄.

Proof Let

\[ p(z) = a_0 + a_1 z + \cdots + a_m z^m, \]

where a₀, …, aₘ are real numbers. Suppose λ ∈ C is a zero of p. Then

\[ a_0 + a_1\lambda + \cdots + a_m\lambda^m = 0. \]

Take the complex conjugate of both sides of this equation, obtaining

\[ a_0 + a_1\bar\lambda + \cdots + a_m\bar\lambda^m = 0, \]

where we have used basic properties of complex conjugation (see 4.5). The
equation above shows that λ̄ is a zero of p. ■
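A small numerical illustration of 4.15 follows. It is a sketch only; the helper name evaluate and the sample polynomial z³ − z² + z − 1 = (z − 1)(z² + 1) are assumptions chosen so that the arithmetic is exact.

```python
def evaluate(coeffs, z):
    """Evaluate a0 + a1 z + ... + am z^m by Horner's rule; coeffs = [a0, ..., am]."""
    value = 0
    for a in reversed(coeffs):
        value = value * z + a
    return value


real_coeffs = [-1, 1, -1, 1]   # p(z) = z^3 - z^2 + z - 1, real coefficients
lam = 1j                       # i is a zero of p

value_at_lam = evaluate(real_coeffs, lam)
value_at_conjugate = evaluate(real_coeffs, lam.conjugate())
```

Both values are 0, so the conjugate −i is also a zero of this polynomial, as 4.15 predicts.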


Think about the connection be- 
tween the quadratic formula and 
4.16. 


We want a factorization theorem for 
polynomials with real coefficients. First 
we need to characterize the polynomi- 
als of degree 2 with real coefficients 
that can be written as the product of 
two polynomials of degree 1 with real 
coefficients. 


4.16 Factorization of a quadratic polynomial

Suppose b, c ∈ R. Then there is a polynomial factorization of the form

\[ x^2 + bx + c = (x - \lambda_1)(x - \lambda_2) \]

with λ₁, λ₂ ∈ R if and only if b² ≥ 4c.

Proof Notice that

\[ x^2 + bx + c = \left(x + \frac{b}{2}\right)^2 + \left(c - \frac{b^2}{4}\right). \]

[The equation above is the basis of the technique called completing the
square.]

First suppose b² < 4c. Then clearly the right side of the equation above is
positive for every x ∈ R. Hence the polynomial x² + bx + c has no real
zeros and thus cannot be factored in the form (x − λ₁)(x − λ₂) with λ₁, λ₂ ∈ R.

Conversely, now suppose b² ≥ 4c. Then there is a real number d such
that d² = b²/4 − c. From the displayed equation above, we have

\[
\begin{aligned}
x^2 + bx + c &= \left(x + \frac{b}{2}\right)^2 - d^2 \\
&= \left(x + \frac{b}{2} + d\right)\left(x + \frac{b}{2} - d\right),
\end{aligned}
\]

which gives the desired factorization. ■
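The proof of 4.16 translates directly into a short computation. The sketch below follows the completing-the-square argument; the function name factor_quadratic and the convention of returning None when no real factorization exists are assumptions for illustration.

```python
import math


def factor_quadratic(b, c):
    """Return (lam1, lam2) with x^2 + b x + c = (x - lam1)(x - lam2) over R.

    Returns None when b^2 < 4c, in which case no such real factorization exists.
    """
    d_squared = b * b / 4 - c      # this is d^2 from the proof
    if d_squared < 0:
        return None
    d = math.sqrt(d_squared)
    # x^2 + b x + c = (x + b/2 + d)(x + b/2 - d)
    return (-b / 2 - d, -b / 2 + d)


roots = factor_quadratic(-5, 6)    # x^2 - 5x + 6 = (x - 2)(x - 3)
no_roots = factor_quadratic(0, 1)  # x^2 + 1 has b^2 < 4c
```

Here roots is (2.0, 3.0) and no_roots is None.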


The next result gives a factorization of a polynomial over R. The idea of
the proof is to use the factorization 4.14 of p as a polynomial with complex
coefficients. Complex but nonreal zeros of p come in pairs; see 4.15. Thus
if the factorization of p as an element of P(C) includes terms of the form
(x − λ) with λ a nonreal complex number, then (x − λ̄) is also a term in the
factorization. Multiplying together these two terms, we get

\[ x^2 - 2(\operatorname{Re}\lambda)x + |\lambda|^2, \]

which is a quadratic term of the required form.
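For readers who want the multiplication spelled out, the following worked computation uses only the properties in 4.5. Writing b = −2 Re λ and c = |λ|² is a labeling introduced here for illustration; it shows why the resulting quadratic satisfies b² < 4c when λ is not real.

```latex
\[
\begin{aligned}
(x - \lambda)(x - \bar\lambda)
  &= x^2 - (\lambda + \bar\lambda)x + \lambda\bar\lambda \\
  &= x^2 - 2(\operatorname{Re}\lambda)x + |\lambda|^2, \\[4pt]
b^2 = 4(\operatorname{Re}\lambda)^2
  &< 4\bigl((\operatorname{Re}\lambda)^2 + (\operatorname{Im}\lambda)^2\bigr)
   = 4|\lambda|^2 = 4c
  \qquad\text{because } \operatorname{Im}\lambda \neq 0.
\end{aligned}
\]
```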

The idea sketched in the paragraph above almost provides a proof of the
existence of our desired factorization. However, we need to be careful about
one point. Suppose λ is a nonreal complex number and (x − λ) is a term in
the factorization of p as an element of P(C). We are guaranteed by 4.15 that
(x − λ̄) also appears as a term in the factorization, but 4.15 does not state that
these two factors appear the same number of times, as needed to make the
idea above work. However, the proof works around this point.

In the next result, either m or M may equal 0. The numbers λ₁, …, λₘ
are precisely the real zeros of p, for these are the only real values of x for
which the right side of the equation in the next result equals 0.

4.17 Factorization of a polynomial over R

Suppose p ∈ P(R) is a nonconstant polynomial. Then p has a unique
factorization (except for the order of the factors) of the form

\[ p(x) = c(x - \lambda_1)\cdots(x - \lambda_m)(x^2 + b_1 x + c_1)\cdots(x^2 + b_M x + c_M), \]

where c, λ₁, …, λₘ, b₁, …, b_M, c₁, …, c_M ∈ R, with b_j² < 4c_j for
each j.


Proof Think of p as an element of P(C). If all the (complex) zeros of p are
real, then we are done by 4.14. Thus suppose p has a zero λ ∈ C with λ ∉ R.
By 4.15, λ̄ is a zero of p. Thus we can write

\[
\begin{aligned}
p(x) &= (x - \lambda)(x - \bar\lambda)q(x) \\
&= \bigl(x^2 - 2(\operatorname{Re}\lambda)x + |\lambda|^2\bigr)q(x)
\end{aligned}
\]

for some polynomial q ∈ P(C) with degree two less than the degree of p.
If we can prove that q has real coefficients, then by using induction on the
degree of p, we can conclude that (x − λ̄) appears in the factorization of p
exactly as many times as (x − λ).

To prove that q has real coefficients, we solve the equation above for q,
getting

\[ q(x) = \frac{p(x)}{x^2 - 2(\operatorname{Re}\lambda)x + |\lambda|^2} \]

for all x ∈ R. The equation above implies that q(x) ∈ R for all x ∈ R.
Writing

\[ q(x) = a_0 + a_1 x + \cdots + a_{n-2}x^{n-2}, \]

where n = deg p and a₀, …, aₙ₋₂ ∈ C, we thus have

\[ 0 = \operatorname{Im} q(x) = (\operatorname{Im} a_0) + (\operatorname{Im} a_1)x + \cdots + (\operatorname{Im} a_{n-2})x^{n-2} \]

for all x ∈ R. This implies that Im a₀, …, Im aₙ₋₂ all equal 0 (by 4.7). Thus
all the coefficients of q are real, as desired. Hence the desired factorization
exists.

Now we turn to the question of uniqueness of our factorization. A factor
of p of the form x² + b_j x + c_j with b_j² < 4c_j can be uniquely written
as (x − λ_j)(x − λ̄_j) with λ_j ∈ C. A moment’s thought shows that two
different factorizations of p as an element of P(R) would lead to two different
factorizations of p as an element of P(C), contradicting 4.14. ■

EXERCISES 4


1 Verify all the assertions in 4.5 except the last one. 


2 Suppose m is a positive integer. Is the set

\[ \{0\} \cup \{p \in \mathcal{P}(\mathbf{F}) : \deg p = m\} \]

a subspace of P(F)?


3 Is the set

\[ \{0\} \cup \{p \in \mathcal{P}(\mathbf{F}) : \deg p \text{ is even}\} \]

a subspace of P(F)?


4 Suppose m and n are positive integers with m ≤ n, and suppose
λ₁, …, λₘ ∈ F. Prove that there exists a polynomial p ∈ P(F) with
deg p = n such that 0 = p(λ₁) = ⋯ = p(λₘ) and such that p has no
other zeros.


5 Suppose m is a nonnegative integer, z₁, …, zₘ₊₁ are distinct elements
of F, and w₁, …, wₘ₊₁ ∈ F. Prove that there exists a unique polynomial
p ∈ Pₘ(F) such that

\[ p(z_j) = w_j \]

for j = 1, …, m + 1.
[This result can be proved without using linear algebra. However, try to
find the clearer, shorter proof that uses some linear algebra.]


6 Suppose p € P(C) has degree m. Prove that p has m distinct zeros if 
and only if p and its derivative p’ have no zeros in common. 


7 Prove that every polynomial of odd degree with real coefficients has a 
real zero. 


8 Define T: P(R) → R^R by

\[
(Tp)(x) =
\begin{cases}
\dfrac{p(x) - p(3)}{x - 3} & \text{if } x \neq 3, \\
p'(3) & \text{if } x = 3.
\end{cases}
\]

Show that Tp ∈ P(R) for every polynomial p ∈ P(R) and that T is a
linear map.

9 Suppose p ∈ P(C). Define q: C → C by

\[ q(z) = p(z)\,\overline{p(\bar z)}. \]

Prove that q is a polynomial with real coefficients.


10 Suppose m is a nonnegative integer and p ∈ Pₘ(C) is such that there
exist distinct real numbers x₀, x₁, …, xₘ such that p(xⱼ) ∈ R for
j = 0, 1, …, m. Prove that all the coefficients of p are real.


11 Suppose p ∈ P(F) with p ≠ 0. Let U = {pq : q ∈ P(F)}.

(a) Show that dim P(F)/U = deg p.
(b) Find a basis of P(F)/U.

Statue of Italian mathematician
Leonardo of Pisa (1170-1250, 
approximate dates), also known as 
Fibonacci. Exercise 16 in Section 5.C 
shows how linear algebra can be used 
to find an explicit formula for the 
Fibonacci sequence. 


Eigenvalues, Eigenvectors, and 
Invariant Subspaces 


Linear maps from one vector space to another vector space were the objects 
of study in Chapter 3. Now we begin our investigation of linear maps from 
a finite-dimensional vector space to itself. Their study constitutes the most 
important part of linear algebra. 

Our standing assumptions are as follows: 


5.1 Notation F, V
• F denotes R or C.

• V denotes a vector space over F.


LEARNING OBJECTIVES FOR THIS CHAPTER
■ invariant subspaces
■ eigenvalues, eigenvectors, and eigenspaces
■ each operator on a finite-dimensional complex vector space has an
eigenvalue and an upper-triangular matrix with respect to some
basis

5.A Invariant Subspaces


In this chapter we develop the tools that will help us understand the structure 
of operators. Recall that an operator is a linear map from a vector space to 
itself. Recall also that we denote the set of operators on V by L(V); in other 
words, L(V) = L(V, V). 

Let’s see how we might better understand what an operator looks like. 
Suppose T ∈ L(V). If we have a direct sum decomposition

\[ V = U_1 \oplus \cdots \oplus U_m, \]

where each U_j is a proper subspace of V, then to understand the behavior of
T, we need only understand the behavior of each T|_{U_j}; here T|_{U_j} denotes
the restriction of T to the smaller domain U_j. Dealing with T|_{U_j} should be
easier than dealing with T because U_j is a smaller vector space than V.

However, if we intend to apply tools useful in the study of operators (such
as taking powers), then we have a problem: T|_{U_j} may not map U_j into itself;
in other words, T|_{U_j} may not be an operator on U_j. Thus we are led to
consider only decompositions of V of the form above where T maps each U_j
into itself.

The notion of a subspace that gets mapped into itself is sufficiently impor- 
tant to deserve a name. 


5.2 Definition invariant subspace

Suppose T ∈ L(V). A subspace U of V is called invariant under T if
u ∈ U implies Tu ∈ U.

In other words, U is invariant under T if T|_U is an operator on U.


5.3 Example Suppose T ∈ L(V). Show that each of the following
subspaces of V is invariant under T:

(a) {0};

(b) V;

(c) null T;

(d) range T.

[The most famous unsolved problem in functional analysis is called the
invariant subspace problem. It deals with invariant subspaces of
operators on infinite-dimensional vector spaces.]

Solution

(a) If u ∈ {0}, then u = 0 and hence Tu = 0 ∈ {0}. Thus {0} is invariant
under T.

(b) If u ∈ V, then Tu ∈ V. Thus V is invariant under T.

(c) If u ∈ null T, then Tu = 0, and hence Tu ∈ null T. Thus null T is
invariant under T.

(d) If u ∈ range T, then Tu ∈ range T. Thus range T is invariant under T.


Must an operator T ∈ L(V) have any invariant subspaces other than {0}
and V? Later we will see that this question has an affirmative answer if V is 
finite-dimensional and dim V > 1 (for F = C) or dim V > 2 (for F = R); 
see 5.21 and 9.8. 

Although null T and range T are invariant under T, they do not necessarily 
provide easy answers to the question about the existence of invariant subspaces 
other than {0} and V, because null T may equal {0} and range T may equal 
V (this happens when T is invertible). 


5.4 Example Suppose that T ∈ L(P(R)) is defined by Tp = p′.
Then P₄(R), which is a subspace of P(R), is invariant under T because
if p ∈ P(R) has degree at most 4, then p′ also has degree at most 4.
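Example 5.4 can be checked on a concrete polynomial. The sketch below represents polynomials as coefficient lists (lowest degree first); the names derivative and degree and the sample polynomial are assumptions for illustration.

```python
def derivative(p):
    """Coefficient list of p' for p = [a0, a1, ..., am]."""
    return [j * a for j, a in enumerate(p)][1:]


def degree(p):
    """Degree of p, with the 0 polynomial assigned degree -infinity."""
    nonzero = [j for j, a in enumerate(p) if a != 0]
    return max(nonzero) if nonzero else float("-inf")


p = [1, 0, -2, 0, 7]       # 1 - 2x^2 + 7x^4, an element of P_4(R)
dp = derivative(p)         # coefficients of -4x + 28x^3
```

Since degree(dp) is 3, the derivative again lies in P₄(R), consistent with the invariance claimed in the example.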


Eigenvalues and Eigenvectors 


We will return later to a deeper study of invariant subspaces. Now we turn to an 
investigation of the simplest possible nontrivial invariant subspaces—invariant 
subspaces with dimension 1. 
Take any v ∈ V with v ≠ 0 and let U equal the set of all scalar multiples
of v:

\[ U = \{\lambda v : \lambda \in \mathbf{F}\} = \operatorname{span}(v). \]

Then U is a 1-dimensional subspace of V (and every 1-dimensional subspace
of V is of this form for an appropriate choice of v). If U is invariant under an
operator T ∈ L(V), then Tv ∈ U, and hence there is a scalar λ ∈ F such that

\[ Tv = \lambda v. \]

Conversely, if Tv = λv for some λ ∈ F, then span(v) is a 1-dimensional
subspace of V invariant under T.

The equation

\[ Tv = \lambda v, \]

which we have just seen is intimately connected with 1-dimensional invariant
subspaces, is important enough that the vectors v and scalars λ satisfying it
are given special names.


5.5 Definition eigenvalue

Suppose T ∈ L(V). A number λ ∈ F is called an eigenvalue of T if
there exists v ∈ V such that v ≠ 0 and Tv = λv.


The comments above show that T has a 1-dimensional invariant subspace
if and only if T has an eigenvalue.

In the definition above, we require that v ≠ 0 because every scalar λ ∈ F
satisfies T0 = λ0.

[The word eigenvalue is half-German, half-English. The German adjective
eigen means “own” in the sense of characterizing an intrinsic property.
Some mathematicians use the term characteristic value instead of
eigenvalue.]


5.6 Equivalent conditions to be an eigenvalue

Suppose V is finite-dimensional, T ∈ L(V), and λ ∈ F. Then the
following are equivalent:

(a) λ is an eigenvalue of T;

(b) T − λI is not injective;

(c) T − λI is not surjective;

(d) T − λI is not invertible.

[Recall that I ∈ L(V) is the identity operator defined by Iv = v for
all v ∈ V.]

Proof Conditions (a) and (b) are equivalent because the equation Tv = λv
is equivalent to the equation (T − λI)v = 0. Conditions (b), (c), and (d) are
equivalent by 3.69. ■


5.7 Definition eigenvector

Suppose T ∈ L(V) and λ ∈ F is an eigenvalue of T. A vector v ∈ V is
called an eigenvector of T corresponding to λ if v ≠ 0 and Tv = λv.

Because Tv = λv if and only if (T − λI)v = 0, a vector v ∈ V with v ≠ 0
is an eigenvector of T corresponding to λ if and only if v ∈ null(T − λI).


5.8 Example Suppose T ∈ L(F²) is defined by

\[ T(w, z) = (-z, w). \]

(a) Find the eigenvalues and eigenvectors of T if F = R.

(b) Find the eigenvalues and eigenvectors of T if F = C.

Solution

(a) If F = R, then T is a counterclockwise rotation by 90° about the
origin in R². An operator has an eigenvalue if and only if there exists a
nonzero vector in its domain that gets sent by the operator to a scalar
multiple of itself. A 90° counterclockwise rotation of a nonzero vector
in R² obviously never equals a scalar multiple of itself. Conclusion: if
F = R, then T has no eigenvalues (and thus has no eigenvectors).

(b) To find eigenvalues of T, we must find the scalars λ such that

\[ T(w, z) = \lambda(w, z) \]

has some solution other than w = z = 0. The equation above is
equivalent to the simultaneous equations

5.9    \[ -z = \lambda w, \qquad w = \lambda z. \]

Substituting the value for w given by the second equation into the first
equation gives

\[ -z = \lambda^2 z. \]

Now z cannot equal 0 [otherwise 5.9 implies that w = 0; we are
looking for solutions to 5.9 where (w, z) is not the 0 vector], so the
equation above leads to the equation

\[ -1 = \lambda^2. \]

The solutions to this equation are λ = i and λ = −i. You should
be able to verify easily that i and −i are eigenvalues of T. Indeed,
the eigenvectors corresponding to the eigenvalue i are the vectors of
the form (w, −wi), with w ∈ C and w ≠ 0, and the eigenvectors
corresponding to the eigenvalue −i are the vectors of the form (w, wi),
with w ∈ C and w ≠ 0.
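The eigenvector claims in part (b) can be verified directly with Python's complex numbers. This is a minimal sketch; the helper names T and scale and the particular nonzero value of w are assumptions for illustration.

```python
def T(vec):
    """The operator T(w, z) = (-z, w) from Example 5.8."""
    w, z = vec
    return (-z, w)


def scale(lam, vec):
    """Scalar multiple lam * vec."""
    return tuple(lam * x for x in vec)


w = 2 + 3j                  # arbitrary nonzero complex number
v_plus = (w, -w * 1j)       # claimed eigenvector for the eigenvalue i
v_minus = (w, w * 1j)       # claimed eigenvector for the eigenvalue -i
```

Applying T to v_plus gives the same vector as scale(1j, v_plus), and applying T to v_minus gives the same vector as scale(-1j, v_minus).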

Now we show that eigenvectors corresponding to distinct eigenvalues are
linearly independent.


5.10 Linearly independent eigenvectors

Let T ∈ L(V). Suppose λ₁, …, λₘ are distinct eigenvalues of T and
v₁, …, vₘ are corresponding eigenvectors. Then v₁, …, vₘ is linearly
independent.

Proof Suppose v₁, …, vₘ is linearly dependent. Let k be the smallest
positive integer such that

5.11    \[ v_k \in \operatorname{span}(v_1, \dots, v_{k-1}); \]

the existence of k with this property follows from the Linear Dependence
Lemma (2.21). Thus there exist a₁, …, aₖ₋₁ ∈ F such that

5.12    \[ v_k = a_1 v_1 + \cdots + a_{k-1} v_{k-1}. \]

Apply T to both sides of this equation, getting

\[ \lambda_k v_k = a_1\lambda_1 v_1 + \cdots + a_{k-1}\lambda_{k-1} v_{k-1}. \]

Multiply both sides of 5.12 by λₖ and then subtract the equation above, getting

\[ 0 = a_1(\lambda_k - \lambda_1)v_1 + \cdots + a_{k-1}(\lambda_k - \lambda_{k-1})v_{k-1}. \]

Because we chose k to be the smallest positive integer satisfying 5.11,
v₁, …, vₖ₋₁ is linearly independent. Thus the equation above implies that all
the a’s are 0 (recall that λₖ is not equal to any of λ₁, …, λₖ₋₁). However, this
means that vₖ equals 0 (see 5.12), contradicting our hypothesis that vₖ is an
eigenvector. Therefore our assumption that v₁, …, vₘ is linearly dependent
was false. ■


The corollary below states that an operator cannot have more distinct 
eigenvalues than the dimension of the vector space on which it acts. 


5.13 Number of eigenvalues

Suppose V is finite-dimensional. Then each operator on V has at most
dim V distinct eigenvalues.

Proof Let T ∈ L(V). Suppose λ₁, …, λₘ are distinct eigenvalues of T.
Let v₁, …, vₘ be corresponding eigenvectors. Then 5.10 implies that the list
v₁, …, vₘ is linearly independent. Thus m ≤ dim V (see 2.23), as desired. ■

Restriction and Quotient Operators


If T ∈ L(V) and U is a subspace of V invariant under T, then U determines
two other operators T|_U ∈ L(U) and T/U ∈ L(V/U) in a natural way, as
defined below.


5.14 Definition T|_U and T/U

Suppose T ∈ L(V) and U is a subspace of V invariant under T.

• The restriction operator T|_U ∈ L(U) is defined by

\[ T|_U(u) = Tu \]

for u ∈ U.

• The quotient operator T/U ∈ L(V/U) is defined by

\[ (T/U)(v + U) = Tv + U \]

for v ∈ V.


For both the operators defined above, it is worthwhile to pay attention
to their domains and to spend a moment thinking about why they are well
defined as operators on their domains. First consider the restriction operator
T|_U ∈ L(U), which is T with its domain restricted to U, thought of as
mapping into U instead of into V. The condition that U is invariant under T
is what allows us to think of T|_U as an operator on U, meaning a linear map
into the same space as the domain, rather than as simply a linear map from
one vector space to another vector space.

To show that the definition above of the quotient operator makes sense,
we need to verify that if v + U = w + U, then Tv + U = Tw + U. Hence
suppose v + U = w + U. Thus v − w ∈ U (see 3.85). Because U is invariant
under T, we also have T(v − w) ∈ U, which implies that Tv − Tw ∈ U, which
implies that Tv + U = Tw + U, as desired.

Suppose T is an operator on a finite-dimensional vector space V and U is
a subspace of V invariant under T, with U ≠ {0} and U ≠ V. In some sense,
we can learn about T by studying the operators T|_U and T/U, each of which
is an operator on a vector space with smaller dimension than V. For example,
proof 2 of 5.27 makes nice use of T/U.

However, sometimes T|_U and T/U do not provide enough information
about T. In the next example, both T|_U and T/U are 0 even though T is not
the 0 operator.




CHAPTER 5 Eigenvalues, Eigenvectors, and Invariant Subspaces 


5.15 Example Define an operator $T \in \mathcal{L}(\mathbf{F}^2)$ by $T(x, y) = (y, 0)$. Let
$U = \{(x, 0) : x \in \mathbf{F}\}$. Show that

(a) $U$ is invariant under $T$ and $T|_U$ is the 0 operator on $U$;

(b) there does not exist a subspace $W$ of $\mathbf{F}^2$ that is invariant under $T$ and
such that $\mathbf{F}^2 = U \oplus W$;

(c) $T/U$ is the 0 operator on $\mathbf{F}^2/U$.

Solution

(a) For $(x, 0) \in U$, we have $T(x, 0) = (0, 0) \in U$. Thus $U$ is invariant
under $T$ and $T|_U$ is the 0 operator on $U$.

(b) Suppose $W$ is a subspace of $\mathbf{F}^2$ such that $\mathbf{F}^2 = U \oplus W$. Because
$\dim \mathbf{F}^2 = 2$ and $\dim U = 1$, we have $\dim W = 1$. If $W$ were invariant
under $T$, then each nonzero vector in $W$ would be an eigenvector of $T$.
However, it is easy to see that 0 is the only eigenvalue of $T$ and that all
eigenvectors of $T$ are in $U$. Thus $W$ is not invariant under $T$.

(c) For $(x, y) \in \mathbf{F}^2$, we have
\[ (T/U)\big((x, y) + U\big) = T(x, y) + U = (y, 0) + U = 0 + U, \]
where the last equality holds because $(y, 0) \in U$. The equation above
shows that $T/U$ is the 0 operator.
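
The claims in parts (a) and (c) of the example above can be checked numerically. The following is a minimal sketch, assuming F = R and using NumPy; the helper name `in_U` is hypothetical and introduced only for this illustration.

```python
import numpy as np

# Matrix of T(x, y) = (y, 0) with respect to the standard basis of R^2
# (F = R is an assumption made only for this illustration).
T = np.array([[0.0, 1.0],
              [0.0, 0.0]])

e1 = np.array([1.0, 0.0])  # spans U = {(x, 0)}

def in_U(vec):
    """A vector lies in U exactly when its second coordinate is 0."""
    return bool(np.isclose(vec[1], 0.0))

# (a) T restricted to U is the zero operator: T e1 = 0.
restriction_is_zero = bool(np.allclose(T @ e1, 0.0))

# (c) T/U is the zero operator: Tv lies in U for every v.
# By linearity it is enough to check the standard basis of R^2.
quotient_is_zero = all(in_U(T @ v) for v in np.eye(2))

# T itself is not the zero operator.
T_is_zero = bool(np.allclose(T, 0.0))

print(restriction_is_zero, quotient_is_zero, T_is_zero)
```

Both induced operators vanish while T does not, which is the loss of information the example is meant to exhibit.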


EXERCISES 5.A 


1  Suppose $T \in \mathcal{L}(V)$ and $U$ is a subspace of $V$.
   (a) Prove that if $U \subset \operatorname{null} T$, then $U$ is invariant under $T$.
   (b) Prove that if $\operatorname{range} T \subset U$, then $U$ is invariant under $T$.

2  Suppose $S, T \in \mathcal{L}(V)$ are such that $ST = TS$. Prove that $\operatorname{null} S$ is
   invariant under $T$.

3  Suppose $S, T \in \mathcal{L}(V)$ are such that $ST = TS$. Prove that $\operatorname{range} S$ is
   invariant under $T$.

4  Suppose that $T \in \mathcal{L}(V)$ and $U_1, \dots, U_m$ are subspaces of $V$ invariant
   under $T$. Prove that $U_1 + \cdots + U_m$ is invariant under $T$.

5  Suppose $T \in \mathcal{L}(V)$. Prove that the intersection of every collection of
   subspaces of $V$ invariant under $T$ is invariant under $T$.

6  Prove or give a counterexample: if $V$ is finite-dimensional and $U$ is a
   subspace of $V$ that is invariant under every operator on $V$, then $U = \{0\}$
   or $U = V$.

7  Suppose $T \in \mathcal{L}(\mathbf{R}^2)$ is defined by $T(x, y) = (-3y, x)$. Find the
   eigenvalues of $T$.

8  Define $T \in \mathcal{L}(\mathbf{F}^2)$ by
   \[ T(w, z) = (z, w). \]
   Find all eigenvalues and eigenvectors of $T$.

9  Define $T \in \mathcal{L}(\mathbf{F}^3)$ by
   \[ T(z_1, z_2, z_3) = (2z_2, 0, 5z_3). \]
   Find all eigenvalues and eigenvectors of $T$.

10 Define $T \in \mathcal{L}(\mathbf{F}^n)$ by
   \[ T(x_1, x_2, x_3, \dots, x_n) = (x_1, 2x_2, 3x_3, \dots, nx_n). \]
   (a) Find all eigenvalues and eigenvectors of $T$.
   (b) Find all invariant subspaces of $T$.

11 Define $T \colon \mathcal{P}(\mathbf{R}) \to \mathcal{P}(\mathbf{R})$ by $Tp = p'$. Find all eigenvalues and
   eigenvectors of $T$.

12 Define $T \in \mathcal{L}(\mathcal{P}_4(\mathbf{R}))$ by
   \[ (Tp)(x) = x p'(x) \]
   for all $x \in \mathbf{R}$. Find all eigenvalues and eigenvectors of $T$.

13 Suppose $V$ is finite-dimensional, $T \in \mathcal{L}(V)$, and $\lambda \in \mathbf{F}$. Prove that there
   exists $\alpha \in \mathbf{F}$ such that $|\alpha - \lambda| < \frac{1}{1000}$ and $T - \alpha I$ is invertible.


14 Suppose $V = U \oplus W$, where $U$ and $W$ are nonzero subspaces of $V$.
   Define $P \in \mathcal{L}(V)$ by $P(u + w) = u$ for $u \in U$ and $w \in W$. Find all
   eigenvalues and eigenvectors of $P$.

15 Suppose $T \in \mathcal{L}(V)$. Suppose $S \in \mathcal{L}(V)$ is invertible.
   (a) Prove that $T$ and $S^{-1}TS$ have the same eigenvalues.
   (b) What is the relationship between the eigenvectors of $T$ and the
       eigenvectors of $S^{-1}TS$?

16 Suppose $V$ is a complex vector space, $T \in \mathcal{L}(V)$, and the matrix of $T$
   with respect to some basis of $V$ contains only real entries. Show that if
   $\lambda$ is an eigenvalue of $T$, then so is $\bar{\lambda}$.

17 Give an example of an operator $T \in \mathcal{L}(\mathbf{R}^4)$ such that $T$ has no (real)
   eigenvalues.

18 Show that the operator $T \in \mathcal{L}(\mathbf{C}^\infty)$ defined by
   \[ T(z_1, z_2, \dots) = (0, z_1, z_2, \dots) \]
   has no eigenvalues.

19 Suppose $n$ is a positive integer and $T \in \mathcal{L}(\mathbf{F}^n)$ is defined by
   \[ T(x_1, \dots, x_n) = (x_1 + \cdots + x_n, \dots, x_1 + \cdots + x_n); \]
   in other words, $T$ is the operator whose matrix (with respect to the
   standard basis) consists of all 1's. Find all eigenvalues and eigenvectors
   of $T$.

20 Find all eigenvalues and eigenvectors of the backward shift operator
   $T \in \mathcal{L}(\mathbf{F}^\infty)$ defined by
   \[ T(z_1, z_2, z_3, \dots) = (z_2, z_3, \dots). \]

21 Suppose $T \in \mathcal{L}(V)$ is invertible.
   (a) Suppose $\lambda \in \mathbf{F}$ with $\lambda \ne 0$. Prove that $\lambda$ is an eigenvalue of $T$ if
       and only if $\frac{1}{\lambda}$ is an eigenvalue of $T^{-1}$.
   (b) Prove that $T$ and $T^{-1}$ have the same eigenvectors.
22 Suppose $T \in \mathcal{L}(V)$ and there exist nonzero vectors $v$ and $w$ in $V$ such
   that
   \[ Tv = 3w \quad\text{and}\quad Tw = 3v. \]
   Prove that 3 or $-3$ is an eigenvalue of $T$.

23 Suppose $V$ is finite-dimensional and $S, T \in \mathcal{L}(V)$. Prove that $ST$ and
   $TS$ have the same eigenvalues.

24 Suppose $A$ is an $n$-by-$n$ matrix with entries in $\mathbf{F}$. Define $T \in \mathcal{L}(\mathbf{F}^n)$
   by $Tx = Ax$, where elements of $\mathbf{F}^n$ are thought of as $n$-by-1 column
   vectors.
   (a) Suppose the sum of the entries in each row of $A$ equals 1. Prove
       that 1 is an eigenvalue of $T$.
   (b) Suppose the sum of the entries in each column of $A$ equals 1.
       Prove that 1 is an eigenvalue of $T$.

25 Suppose $T \in \mathcal{L}(V)$ and $u, v$ are eigenvectors of $T$ such that $u + v$
   is also an eigenvector of $T$. Prove that $u$ and $v$ are eigenvectors of $T$
   corresponding to the same eigenvalue.

26 Suppose $T \in \mathcal{L}(V)$ is such that every nonzero vector in $V$ is an
   eigenvector of $T$. Prove that $T$ is a scalar multiple of the identity operator.

27 Suppose $V$ is finite-dimensional and $T \in \mathcal{L}(V)$ is such that every
   subspace of $V$ with dimension $\dim V - 1$ is invariant under $T$. Prove that $T$
   is a scalar multiple of the identity operator.

28 Suppose $V$ is finite-dimensional with $\dim V \ge 3$ and $T \in \mathcal{L}(V)$ is such
   that every 2-dimensional subspace of $V$ is invariant under $T$. Prove that
   $T$ is a scalar multiple of the identity operator.

29 Suppose $T \in \mathcal{L}(V)$ and $\dim \operatorname{range} T = k$. Prove that $T$ has at most
   $k + 1$ distinct eigenvalues.

30 Suppose $T \in \mathcal{L}(\mathbf{R}^3)$ and $-4$, $5$, and $\sqrt{7}$ are eigenvalues of $T$. Prove that
   there exists $x \in \mathbf{R}^3$ such that $Tx - 9x = (-4, 5, \sqrt{7})$.

31 Suppose $V$ is finite-dimensional and $v_1, \dots, v_m$ is a list of vectors in $V$.
   Prove that $v_1, \dots, v_m$ is linearly independent if and only if there exists
   $T \in \mathcal{L}(V)$ such that $v_1, \dots, v_m$ are eigenvectors of $T$ corresponding to
   distinct eigenvalues.
32 Suppose $\lambda_1, \dots, \lambda_n$ is a list of distinct real numbers. Prove that the list
   $e^{\lambda_1 x}, \dots, e^{\lambda_n x}$ is linearly independent in the vector space of real-valued
   functions on $\mathbf{R}$.

   Hint: Let $V = \operatorname{span}(e^{\lambda_1 x}, \dots, e^{\lambda_n x})$, and define an operator $T \in \mathcal{L}(V)$
   by $Tf = f'$. Find eigenvalues and eigenvectors of $T$.

33 Suppose $T \in \mathcal{L}(V)$. Prove that $T/(\operatorname{range} T) = 0$.

34 Suppose $T \in \mathcal{L}(V)$. Prove that $T/(\operatorname{null} T)$ is injective if and only if
   $(\operatorname{null} T) \cap (\operatorname{range} T) = \{0\}$.

35 Suppose $V$ is finite-dimensional, $T \in \mathcal{L}(V)$, and $U$ is invariant under $T$.
   Prove that each eigenvalue of $T/U$ is an eigenvalue of $T$.

   [The exercise below asks you to verify that the hypothesis that $V$ is
   finite-dimensional is needed for the exercise above.]

36 Give an example of a vector space $V$, an operator $T \in \mathcal{L}(V)$, and
   a subspace $U$ of $V$ that is invariant under $T$ such that $T/U$ has an
   eigenvalue that is not an eigenvalue of $T$.
5.B Eigenvectors and Upper-Triangular Matrices


Polynomials Applied to Operators 


The main reason that a richer theory exists for operators (which map a vector 
space into itself) than for more general linear maps is that operators can be 
raised to powers. We begin this section by defining that notion and the key 
concept of applying a polynomial to an operator. 
If $T \in \mathcal{L}(V)$, then $TT$ makes sense and is also in $\mathcal{L}(V)$. We usually write
$T^2$ instead of $TT$. More generally, we have the following definition.

5.16 Definition $T^m$

Suppose $T \in \mathcal{L}(V)$ and $m$ is a positive integer.

• $T^m$ is defined by
\[ T^m = \underbrace{T \cdots T}_{m \text{ times}}. \]
• $T^0$ is defined to be the identity operator $I$ on $V$.
• If $T$ is invertible with inverse $T^{-1}$, then $T^{-m}$ is defined by
\[ T^{-m} = (T^{-1})^m. \]

You should verify that if $T$ is an operator, then
\[ T^m T^n = T^{m+n} \quad\text{and}\quad (T^m)^n = T^{mn}, \]
where $m$ and $n$ are allowed to be arbitrary integers if $T$ is invertible and
nonnegative integers if $T$ is not invertible.
5.17 Definition $p(T)$
Suppose $T \in \mathcal{L}(V)$ and $p \in \mathcal{P}(\mathbf{F})$ is a polynomial given by
\[ p(z) = a_0 + a_1 z + a_2 z^2 + \cdots + a_m z^m \]
for $z \in \mathbf{F}$. Then $p(T)$ is the operator defined by
\[ p(T) = a_0 I + a_1 T + a_2 T^2 + \cdots + a_m T^m. \]

This is a new use of the symbol $p$ because we are applying it to operators,
not just elements of $\mathbf{F}$.
5.18 Example Suppose $D \in \mathcal{L}(\mathcal{P}(\mathbf{R}))$ is the differentiation operator
defined by $Dq = q'$ and $p$ is the polynomial defined by $p(x) = 7 - 3x + 5x^2$.
Then $p(D) = 7I - 3D + 5D^2$; thus
\[ (p(D))q = 7q - 3q' + 5q'' \]
for every $q \in \mathcal{P}(\mathbf{R})$.
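
To make the example above concrete, here is a small sketch in plain Python that applies p(D) to one particular polynomial. The representation of polynomials as coefficient lists (lowest degree first) and the function names `derivative` and `apply_poly_to_operator` are choices made only for this illustration.

```python
def derivative(q):
    """Coefficients of q' given the coefficients of q (lowest degree first)."""
    return [k * q[k] for k in range(1, len(q))] or [0]

def apply_poly_to_operator(p, op, q):
    """Return (p(op))(q) = p[0]*q + p[1]*op(q) + p[2]*op(op(q)) + ...

    Assumes op never raises the degree (true for differentiation),
    so every intermediate result fits in len(q) coefficients.
    """
    result = [0] * len(q)
    power = list(q)                       # op^0 applied to q
    for a in p:
        padded = power + [0] * (len(q) - len(power))
        result = [r + a * c for r, c in zip(result, padded)]
        power = op(power)                 # next power of op applied to q
    return result

p = [7, -3, 5]       # p(x) = 7 - 3x + 5x^2
q = [0, 0, 0, 1]     # q(x) = x^3
pDq = apply_poly_to_operator(p, derivative, q)
print(pDq)           # coefficients of 7q - 3q' + 5q''
```

For q(x) = x^3 we have q' = 3x^2 and q'' = 6x, so 7q - 3q' + 5q'' = 30x - 9x^2 + 7x^3, which is the coefficient list the sketch produces.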


If we fix an operator $T \in \mathcal{L}(V)$, then the function from $\mathcal{P}(\mathbf{F})$ to $\mathcal{L}(V)$
given by $p \mapsto p(T)$ is linear, as you should verify.


5.19 Definition product of polynomials
If $p, q \in \mathcal{P}(\mathbf{F})$, then $pq \in \mathcal{P}(\mathbf{F})$ is the polynomial defined by
\[ (pq)(z) = p(z)q(z) \]
for $z \in \mathbf{F}$.


Any two polynomials of an operator commute, as shown below. 


5.20 Multiplicative properties
Suppose $p, q \in \mathcal{P}(\mathbf{F})$ and $T \in \mathcal{L}(V)$. Then

(a) $(pq)(T) = p(T)q(T)$;
(b) $p(T)q(T) = q(T)p(T)$.

[Part (a) holds because when expanding a product of polynomials using the
distributive property, it does not matter whether the symbol is $z$ or $T$.]

Proof

(a) Suppose $p(z) = \sum_{j=0}^{m} a_j z^j$ and $q(z) = \sum_{k=0}^{n} b_k z^k$ for $z \in \mathbf{F}$.
Then
\[ (pq)(z) = \sum_{j=0}^{m} \sum_{k=0}^{n} a_j b_k z^{j+k}. \]
Thus
\[ (pq)(T) = \sum_{j=0}^{m} \sum_{k=0}^{n} a_j b_k T^{j+k}
           = \Big( \sum_{j=0}^{m} a_j T^j \Big) \Big( \sum_{k=0}^{n} b_k T^k \Big)
           = p(T)q(T). \]

(b) Part (a) implies $p(T)q(T) = (pq)(T) = (qp)(T) = q(T)p(T)$. ■
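
Both parts of 5.20 can be spot-checked on a concrete matrix. The sketch below assumes NumPy; the matrix, the two polynomials, and the helper `poly_of_operator` are arbitrary choices for illustration, and a check on one example is of course not a proof.

```python
import numpy as np

def poly_of_operator(coeffs, T):
    """Evaluate a_0 I + a_1 T + ... + a_m T^m (coeffs lowest degree first)."""
    result = np.zeros_like(T)
    power = np.eye(T.shape[0], dtype=T.dtype)   # T^0 = I
    for a in coeffs:
        result = result + a * power
        power = power @ T
    return result

T = np.array([[2, 1],
              [0, 3]])
p = [1, 2]          # p(z) = 1 + 2z
q = [-1, 0, 1]      # q(z) = -1 + z^2

# Coefficients of the product polynomial pq: multiplying polynomials
# convolves their coefficient sequences.
pq = np.convolve(p, q)

pT = poly_of_operator(p, T)
qT = poly_of_operator(q, T)

print(np.array_equal(poly_of_operator(pq, T), pT @ qT))  # part (a)
print(np.array_equal(pT @ qT, qT @ pT))                  # part (b)
```

Integer entries are used so that the comparisons are exact rather than up to rounding.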
Existence of Eigenvalues


Now we come to one of the central results about operators on complex vector 
spaces. 


5.21 Operators on complex vector spaces have an eigenvalue 


Every operator on a finite-dimensional, nonzero, complex vector space 
has an eigenvalue. 


Proof Suppose $V$ is a complex vector space with dimension $n > 0$ and
$T \in \mathcal{L}(V)$. Choose $v \in V$ with $v \ne 0$. Then
\[ v, Tv, T^2 v, \dots, T^n v \]
is not linearly independent, because $V$ has dimension $n$ and we have $n + 1$
vectors. Thus there exist complex numbers $a_0, \dots, a_n$, not all 0, such that
\[ 0 = a_0 v + a_1 Tv + \cdots + a_n T^n v. \]
Note that $a_1, \dots, a_n$ cannot all be 0, because otherwise the equation above
would become $0 = a_0 v$, which would force $a_0$ also to be 0.

Make the $a$'s the coefficients of a polynomial, which by the Fundamental
Theorem of Algebra (4.14) has a factorization
\[ a_0 + a_1 z + \cdots + a_n z^n = c(z - \lambda_1) \cdots (z - \lambda_m), \]
where $c$ is a nonzero complex number, each $\lambda_j$ is in $\mathbf{C}$, and the equation holds
for all $z \in \mathbf{C}$ (here $m$ is not necessarily equal to $n$, because $a_n$ may equal 0).
We then have
\[ 0 = a_0 v + a_1 Tv + \cdots + a_n T^n v
     = (a_0 I + a_1 T + \cdots + a_n T^n) v
     = c(T - \lambda_1 I) \cdots (T - \lambda_m I) v. \]
Thus $T - \lambda_j I$ is not injective for at least one $j$. In other words, $T$ has an
eigenvalue. ■


The proof above depends on the Fundamental Theorem of Algebra, which 
is typical of proofs of this result. See Exercises 16 and 17 for possible ways to 
rewrite the proof above using the idea of the proof in a slightly different form. 
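
The steps of the proof of 5.21 can be followed numerically. The sketch below is only an illustration under stated assumptions: it uses NumPy, obtains the dependency among v, Tv, ..., T^n v from a singular value decomposition, and factors the polynomial with `numpy.roots`. The function name `eigenvalue_via_5_21` is hypothetical, and this is not a recommended way to compute eigenvalues in practice (the list v, Tv, ..., T^n v is often badly conditioned).

```python
import numpy as np

def eigenvalue_via_5_21(T, v):
    """Follow the proof of 5.21 numerically to locate one eigenvalue of T."""
    n = T.shape[0]
    # The list v, Tv, ..., T^n v: n + 1 vectors in an n-dimensional space.
    cols = [v.astype(complex)]
    for _ in range(n):
        cols.append(T @ cols[-1])
    K = np.column_stack(cols)                 # n-by-(n+1) matrix
    # A nonzero a with K a = 0, i.e. a_0 v + a_1 Tv + ... + a_n T^n v = 0.
    # The last right singular vector of K lies in its null space.
    _, _, Vh = np.linalg.svd(K)
    a = Vh[-1].conj()
    # Factor a_0 + a_1 z + ... + a_n z^n; np.roots expects the
    # highest-degree coefficient first.
    roots = np.roots(a[::-1])

    # T - lambda_j I is not injective for at least one root lambda_j;
    # pick the root for which T - lambda I is closest to singular.
    def smallest_singular_value(lam):
        return np.linalg.svd(T - lam * np.eye(n), compute_uv=False)[-1]

    return min(roots, key=smallest_singular_value)

# Rotation by 90 degrees: no real eigenvalues, but viewed as an
# operator on C^2 it has eigenvalues i and -i.
T_rot = np.array([[0.0, -1.0],
                  [1.0,  0.0]])
v0 = np.array([1.0, 0.0])
lam = eigenvalue_via_5_21(T_rot, v0)
print(lam)
```

Here the dependency found is a multiple of v + T^2 v = 0, so the polynomial is a multiple of 1 + z^2 and its roots are i and -i, both of which happen to be eigenvalues; in general only some of the roots need be.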
Upper-Triangular Matrices


In Chapter 3 we discussed the matrix of a linear map from one vector space 
to another vector space. That matrix depended on a choice of a basis of each 
of the two vector spaces. Now that we are studying operators, which map a 
vector space to itself, the emphasis is on using only one basis. 


5.22 Definition matrix of an operator, $\mathcal{M}(T)$

Suppose $T \in \mathcal{L}(V)$ and $v_1, \dots, v_n$ is a basis of $V$. The matrix of $T$ with
respect to this basis is the $n$-by-$n$ matrix
\[ \mathcal{M}(T) = \begin{pmatrix} A_{1,1} & \cdots & A_{1,n} \\ \vdots & & \vdots \\ A_{n,1} & \cdots & A_{n,n} \end{pmatrix} \]
whose entries $A_{j,k}$ are defined by
\[ Tv_k = A_{1,k} v_1 + \cdots + A_{n,k} v_n. \]
If the basis is not clear from the context, then the notation
$\mathcal{M}(T, (v_1, \dots, v_n))$ is used.


Note that the matrices of operators are square arrays, rather than the more
general rectangular arrays that we considered earlier for linear maps.

[The $k$th column of the matrix $\mathcal{M}(T)$ is formed from the coefficients used to
write $Tv_k$ as a linear combination of $v_1, \dots, v_n$.]

If $T$ is an operator on $\mathbf{F}^n$ and no basis is specified, assume that the basis
in question is the standard one (where the $j$th basis vector is 1 in the $j$th slot
and 0 in all the other slots). You can then think of the $j$th column of $\mathcal{M}(T)$
as $T$ applied to the $j$th basis vector.


5.23 Example Define $T \in \mathcal{L}(\mathbf{F}^3)$ by $T(x, y, z) = (2x + y, 5y + 3z, 8z)$.
Then
\[ \mathcal{M}(T) = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 5 & 3 \\ 0 & 0 & 8 \end{pmatrix}. \]

A central goal of linear algebra is to show that given an operator T € L(V), 
there exists a basis of V with respect to which T has a reasonably simple 
matrix. To make this vague formulation a bit more precise, we might try to 
choose a basis of V such that M(T) has many 0’s. 
If $V$ is a finite-dimensional complex vector space, then we already know
enough to show that there is a basis of $V$ with respect to which the matrix of
$T$ has 0's everywhere in the first column, except possibly the first entry. In
other words, there is a basis of $V$ with respect to which the matrix of $T$ looks
like
\[ \begin{pmatrix} \lambda & & \\ 0 & & * \\ \vdots & & \\ 0 & & \end{pmatrix}; \]
here the $*$ denotes the entries in all the columns other than the first column.
To prove this, let $\lambda$ be an eigenvalue of $T$ (one exists by 5.21) and let $v$ be a
corresponding eigenvector. Extend $v$ to a basis of $V$. Then the matrix of $T$
with respect to this basis has the form above.

Soon we will see that we can choose a basis of $V$ with respect to which
the matrix of $T$ has even more 0's.


5.24 Definition diagonal of a matrix 


The diagonal of a square matrix consists of the entries along the line from 
the upper left corner to the bottom right corner. 


For example, the diagonal of the matrix in 5.23 consists of the entries 
2,5, 8. 


5.25 Definition upper-triangular matrix 


A matrix is called upper triangular if all the entries below the diagonal 
equal 0. 


For example, the matrix in 5.23 is upper triangular. 
Typically we represent an upper-triangular matrix in the form
\[ \begin{pmatrix} \lambda_1 & & * \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix}; \]
the 0 in the matrix above indicates that all entries below the diagonal in
this $n$-by-$n$ matrix equal 0. Upper-triangular matrices can be considered
reasonably simple: for $n$ large, almost half the entries in an $n$-by-$n$
upper-triangular matrix are 0.

[We often use $*$ to denote matrix entries that we do not know about or
that are irrelevant to the questions being discussed.]
The following proposition demonstrates a useful connection between
upper-triangular matrices and invariant subspaces.

5.26 Conditions for upper-triangular matrix

Suppose $T \in \mathcal{L}(V)$ and $v_1, \dots, v_n$ is a basis of $V$. Then the following are
equivalent:

(a) the matrix of $T$ with respect to $v_1, \dots, v_n$ is upper triangular;

(b) $Tv_j \in \operatorname{span}(v_1, \dots, v_j)$ for each $j = 1, \dots, n$;

(c) $\operatorname{span}(v_1, \dots, v_j)$ is invariant under $T$ for each $j = 1, \dots, n$.

Proof The equivalence of (a) and (b) follows easily from the definitions and
a moment's thought. Obviously (c) implies (b). Hence to complete the proof,
we need only prove that (b) implies (c).

Thus suppose (b) holds. Fix $j \in \{1, \dots, n\}$. From (b), we know that
\[ Tv_1 \in \operatorname{span}(v_1) \subset \operatorname{span}(v_1, \dots, v_j); \]
\[ Tv_2 \in \operatorname{span}(v_1, v_2) \subset \operatorname{span}(v_1, \dots, v_j); \]
\[ \vdots \]
\[ Tv_j \in \operatorname{span}(v_1, \dots, v_j). \]
Thus if $v$ is a linear combination of $v_1, \dots, v_j$, then
\[ Tv \in \operatorname{span}(v_1, \dots, v_j). \]
In other words, $\operatorname{span}(v_1, \dots, v_j)$ is invariant under $T$, completing the proof. ■
Now we can prove that for each operator on a finite-dimensional complex
vector space, there is a basis of the vector space with respect to which the
matrix of the operator has only 0's below the diagonal. In Chapter 8 we will
improve even this result.

Sometimes more insight comes from seeing more than one proof of a theorem.
Thus two proofs are presented of the next result. Use whichever appeals
more to you.

[The next result does not hold on real vector spaces, because the first
vector in a basis with respect to which an operator has an upper-triangular
matrix is an eigenvector of the operator. Thus if an operator on a real vector
space has no eigenvalues (see 5.8(a) for an example), then there is no basis with
respect to which the operator has an upper-triangular matrix.]
5.27 Over $\mathbf{C}$, every operator has an upper-triangular matrix

Suppose $V$ is a finite-dimensional complex vector space and $T \in \mathcal{L}(V)$.
Then $T$ has an upper-triangular matrix with respect to some basis of $V$.


Proof 1 We will use induction on the dimension of $V$. Clearly the desired
result holds if $\dim V = 1$.

Suppose now that $\dim V > 1$ and the desired result holds for all complex
vector spaces whose dimension is less than the dimension of $V$. Let $\lambda$ be any
eigenvalue of $T$ (5.21 guarantees that $T$ has an eigenvalue). Let
\[ U = \operatorname{range}(T - \lambda I). \]
Because $T - \lambda I$ is not surjective (see 3.69), $\dim U < \dim V$. Furthermore,
$U$ is invariant under $T$. To prove this, suppose $u \in U$. Then
\[ Tu = (T - \lambda I)u + \lambda u. \]
Obviously $(T - \lambda I)u \in U$ (because $U$ equals the range of $T - \lambda I$) and
$\lambda u \in U$. Thus the equation above shows that $Tu \in U$. Hence $U$ is invariant
under $T$, as claimed.

Thus $T|_U$ is an operator on $U$. By our induction hypothesis, there is a
basis $u_1, \dots, u_m$ of $U$ with respect to which $T|_U$ has an upper-triangular
matrix. Thus for each $j$ we have (using 5.26)

5.28   $Tu_j = (T|_U)(u_j) \in \operatorname{span}(u_1, \dots, u_j)$.

Extend $u_1, \dots, u_m$ to a basis $u_1, \dots, u_m, v_1, \dots, v_n$ of $V$. For each $k$, we
have
\[ Tv_k = (T - \lambda I)v_k + \lambda v_k. \]
The definition of $U$ shows that $(T - \lambda I)v_k \in U = \operatorname{span}(u_1, \dots, u_m)$. Thus
the equation above shows that

5.29   $Tv_k \in \operatorname{span}(u_1, \dots, u_m, v_1, \dots, v_k)$.

From 5.28 and 5.29, we conclude (using 5.26) that $T$ has an upper-triangular
matrix with respect to the basis $u_1, \dots, u_m, v_1, \dots, v_n$ of $V$, as
desired. ■
Proof 2 We will use induction on the dimension of $V$. Clearly the desired
result holds if $\dim V = 1$.

Suppose now that $\dim V = n > 1$ and the desired result holds for all
complex vector spaces whose dimension is $n - 1$. Let $v_1$ be any eigenvector
of $T$ (5.21 guarantees that $T$ has an eigenvector). Let $U = \operatorname{span}(v_1)$. Then $U$
is an invariant subspace of $T$ and $\dim U = 1$.

Because $\dim V/U = n - 1$ (see 3.89), we can apply our induction hypothesis
to $T/U \in \mathcal{L}(V/U)$. Thus there is a basis $v_2 + U, \dots, v_n + U$ of
$V/U$ such that $T/U$ has an upper-triangular matrix with respect to this basis.
Hence by 5.26,
\[ (T/U)(v_j + U) \in \operatorname{span}(v_2 + U, \dots, v_j + U) \]
for each $j = 2, \dots, n$. Unraveling the meaning of the inclusion above, we
see that
\[ Tv_j \in \operatorname{span}(v_1, \dots, v_j) \]
for each $j = 1, \dots, n$. Thus by 5.26, $T$ has an upper-triangular matrix
with respect to the basis $v_1, \dots, v_n$ of $V$, as desired (it is easy to verify that
$v_1, \dots, v_n$ is a basis of $V$; see Exercise 13 in Section 3.E for a more general
result). ■
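
The induction in Proof 2 translates into a short recursive procedure. The sketch below is a hedged illustration, not the book's method: it assumes NumPy, uses `numpy.linalg.eig` merely as a stand-in for "5.21 guarantees an eigenvector", and extends the eigenvector to an orthonormal basis via a QR factorization (a convenient choice; any extension to a basis would do). The function name `triangularizing_basis` is hypothetical.

```python
import numpy as np

def triangularizing_basis(A):
    """Return an invertible P (columns = basis vectors) such that
    P^{-1} A P is upper triangular, following the induction of Proof 2."""
    A = np.asarray(A, dtype=complex)
    n = A.shape[0]
    if n == 1:
        return np.eye(1, dtype=complex)      # base case: dim V = 1
    # Step 1: pick an eigenvector v1 of A.
    _, vecs = np.linalg.eig(A)
    v1 = vecs[:, 0]
    # Step 2: extend v1 to a basis. QR applied to [v1 | I] gives a unitary Q
    # whose first column is a scalar multiple of v1.
    Q, _ = np.linalg.qr(np.column_stack([v1, np.eye(n)]))
    B = Q.conj().T @ A @ Q                   # first column is (lambda, 0, ..., 0)
    # Step 3: the lower-right block of B plays the role of the matrix of T/U
    # with U = span(v1); apply the induction hypothesis to it.
    P_small = triangularizing_basis(B[1:, 1:])
    P_block = np.eye(n, dtype=complex)
    P_block[1:, 1:] = P_small
    return Q @ P_block

A_example = np.array([[0.0, -1.0, 2.0],
                      [1.0,  0.0, 1.0],
                      [3.0,  1.0, 4.0]])
P = triangularizing_basis(A_example)
R = np.linalg.solve(P, A_example @ P)        # P^{-1} A P
print(np.round(R, 6))
```

Up to rounding, the entries of R below the diagonal are 0, and the diagonal of R lists the eigenvalues of the matrix, in line with 5.32.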


How does one determine from looking at the matrix of an operator whether 
the operator is invertible? If we are fortunate enough to have a basis with 
respect to which the matrix of the operator is upper triangular, then this 
problem becomes easy, as the following proposition shows. 


5.30 Determination of invertibility from upper-triangular matrix 


Suppose $T \in \mathcal{L}(V)$ has an upper-triangular matrix with respect to some
basis of $V$. Then $T$ is invertible if and only if all the entries on the diagonal
of that upper-triangular matrix are nonzero.


Proof Suppose $v_1, \dots, v_n$ is a basis of $V$ with respect to which $T$ has an
upper-triangular matrix

5.31   $\mathcal{M}(T) = \begin{pmatrix} \lambda_1 & & * \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix}$.

We need to prove that $T$ is invertible if and only if all the $\lambda_j$'s are nonzero.

First suppose the diagonal entries $\lambda_1, \dots, \lambda_n$ are all nonzero. The
upper-triangular matrix in 5.31 implies that $Tv_1 = \lambda_1 v_1$. Because $\lambda_1 \ne 0$, we have
$T(v_1/\lambda_1) = v_1$; thus $v_1 \in \operatorname{range} T$.

Now
\[ T(v_2/\lambda_2) = a v_1 + v_2 \]
for some $a \in \mathbf{F}$. The left side of the equation above and $a v_1$ are both in
$\operatorname{range} T$; thus $v_2 \in \operatorname{range} T$.

Similarly, we see that
\[ T(v_3/\lambda_3) = b v_1 + c v_2 + v_3 \]
for some $b, c \in \mathbf{F}$. The left side of the equation above and $b v_1$, $c v_2$ are all in
$\operatorname{range} T$; thus $v_3 \in \operatorname{range} T$.

Continuing in this fashion, we conclude that $v_1, \dots, v_n \in \operatorname{range} T$.
Because $v_1, \dots, v_n$ is a basis of $V$, this implies that $\operatorname{range} T = V$. In other words,
$T$ is surjective. Hence $T$ is invertible (by 3.69), as desired.

To prove the other direction, now suppose that $T$ is invertible. This implies
that $\lambda_1 \ne 0$, because otherwise we would have $Tv_1 = 0$.

Let $1 < j \le n$, and suppose $\lambda_j = 0$. Then 5.31 implies that $T$ maps
$\operatorname{span}(v_1, \dots, v_j)$ into $\operatorname{span}(v_1, \dots, v_{j-1})$. Because
\[ \dim \operatorname{span}(v_1, \dots, v_j) = j \quad\text{and}\quad \dim \operatorname{span}(v_1, \dots, v_{j-1}) = j - 1, \]
this implies that $T$ restricted to $\operatorname{span}(v_1, \dots, v_j)$ is not injective (by 3.23).
Thus there exists $v \in \operatorname{span}(v_1, \dots, v_j)$ such that $v \ne 0$ and $Tv = 0$. Thus $T$
is not injective, which contradicts our hypothesis (for this direction) that $T$ is
invertible. This contradiction means that our assumption that $\lambda_j = 0$ must be
false. Hence $\lambda_j \ne 0$, as desired. ■


As an example of the result above, we see that the operator in Example 5.23
is invertible.

Unfortunately no method exists for exactly computing the eigenvalues of
an operator from its matrix. However, if we are fortunate enough to find a
basis with respect to which the matrix of the operator is upper triangular, then
the problem of computing the eigenvalues becomes trivial, as the following
proposition shows.

[Powerful numeric techniques exist for finding good approximations to
the eigenvalues of an operator from its matrix.]




5.32 Determination of eigenvalues from upper-triangular matrix 


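Suppose $T \in \mathcal{L}(V)$ has an upper-triangular matrix with respect to some
basis of $V$. Then the eigenvalues of $T$ are precisely the entries on the
diagonal of that upper-triangular matrix.


Proof Suppose $v_1, \dots, v_n$ is a basis of $V$ with respect to which $T$ has an
upper-triangular matrix
\[ \mathcal{M}(T) = \begin{pmatrix} \lambda_1 & & * \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix}. \]
Let $\lambda \in \mathbf{F}$. Then
\[ \mathcal{M}(T - \lambda I) = \begin{pmatrix} \lambda_1 - \lambda & & * \\ & \ddots & \\ 0 & & \lambda_n - \lambda \end{pmatrix}. \]
Hence $T - \lambda I$ is not invertible if and only if $\lambda$ equals one of the numbers
$\lambda_1, \dots, \lambda_n$ (by 5.30). Thus $\lambda$ is an eigenvalue of $T$ if and only if $\lambda$ equals one
of the numbers $\lambda_1, \dots, \lambda_n$. ■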


5.33 Example Define $T \in \mathcal{L}(\mathbf{F}^3)$ by $T(x, y, z) = (2x + y, 5y + 3z, 8z)$.
What are the eigenvalues of $T$?


Solution The matrix of $T$ with respect to the standard basis is
\[ \mathcal{M}(T) = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 5 & 3 \\ 0 & 0 & 8 \end{pmatrix}. \]
Thus $\mathcal{M}(T)$ is an upper-triangular matrix. Now 5.32 implies that the
eigenvalues of $T$ are 2, 5, and 8.
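
As a quick numerical cross-check of 5.30 and 5.32 on this matrix, the following sketch (assuming F = R and NumPy) compares the diagonal entries with the eigenvalues returned by a general-purpose numerical routine.

```python
import numpy as np

# Matrix of T(x, y, z) = (2x + y, 5y + 3z, 8z) in the standard basis.
M = np.array([[2.0, 1.0, 0.0],
              [0.0, 5.0, 3.0],
              [0.0, 0.0, 8.0]])

is_upper_triangular = bool(np.allclose(M, np.triu(M)))
diagonal = np.diag(M)

# Eigenvalues from a numerical routine, sorted for comparison (5.32).
eigenvalues = np.sort(np.linalg.eigvals(M).real)

# 5.30: an upper-triangular matrix is invertible exactly when
# every diagonal entry is nonzero.
invertible_by_5_30 = bool(np.all(diagonal != 0))

print(is_upper_triangular, diagonal, eigenvalues, invertible_by_5_30)
```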


Once the eigenvalues of an operator on $\mathbf{F}^n$ are known, the eigenvectors
can be found easily using Gaussian elimination.
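
Finding an eigenvector for a known eigenvalue amounts to finding a nonzero vector in the null space of the matrix of T − λI. The sketch below is an illustration only: instead of Gaussian elimination it reads a null vector off a singular value decomposition (a swapped-in technique that is convenient in NumPy), and the helper name `eigenvector_for` is hypothetical.

```python
import numpy as np

def eigenvector_for(A, lam):
    """Return a unit vector in null(A - lam I), assuming lam is an eigenvalue.

    The last right singular vector belongs to the smallest singular value,
    which is 0 (up to rounding) when A - lam I is not injective.
    """
    n = A.shape[0]
    _, _, Vh = np.linalg.svd(A - lam * np.eye(n))
    return Vh[-1]

A = np.array([[2.0, 1.0, 0.0],
              [0.0, 5.0, 3.0],
              [0.0, 0.0, 8.0]])

vectors = {lam: eigenvector_for(A, lam) for lam in (2.0, 5.0, 8.0)}
for lam, v in vectors.items():
    print(lam, np.allclose(A @ v, lam * v))
```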
EXERCISES 5.B


1  Suppose $T \in \mathcal{L}(V)$ and there exists a positive integer $n$ such that $T^n = 0$.
   (a) Prove that $I - T$ is invertible and that
       \[ (I - T)^{-1} = I + T + \cdots + T^{n-1}. \]
   (b) Explain how you would guess the formula above.

2  Suppose $T \in \mathcal{L}(V)$ and $(T - 2I)(T - 3I)(T - 4I) = 0$. Suppose $\lambda$ is
   an eigenvalue of $T$. Prove that $\lambda = 2$ or $\lambda = 3$ or $\lambda = 4$.

3  Suppose $T \in \mathcal{L}(V)$ and $T^2 = I$ and $-1$ is not an eigenvalue of $T$. Prove
   that $T = I$.

4  Suppose $P \in \mathcal{L}(V)$ and $P^2 = P$. Prove that $V = \operatorname{null} P \oplus \operatorname{range} P$.

5  Suppose $S, T \in \mathcal{L}(V)$ and $S$ is invertible. Suppose $p \in \mathcal{P}(\mathbf{F})$ is a
   polynomial. Prove that
   \[ p(STS^{-1}) = S\,p(T)\,S^{-1}. \]

6  Suppose $T \in \mathcal{L}(V)$ and $U$ is a subspace of $V$ invariant under $T$. Prove
   that $U$ is invariant under $p(T)$ for every polynomial $p \in \mathcal{P}(\mathbf{F})$.

7  Suppose $T \in \mathcal{L}(V)$. Prove that 9 is an eigenvalue of $T^2$ if and only if 3
   or $-3$ is an eigenvalue of $T$.

8  Give an example of $T \in \mathcal{L}(\mathbf{R}^2)$ such that $T^4 = -I$.

9  Suppose $V$ is finite-dimensional, $T \in \mathcal{L}(V)$, and $v \in V$ with $v \ne 0$.
   Let $p$ be a nonzero polynomial of smallest degree such that $p(T)v = 0$.
   Prove that every zero of $p$ is an eigenvalue of $T$.

10 Suppose $T \in \mathcal{L}(V)$ and $v$ is an eigenvector of $T$ with eigenvalue $\lambda$.
   Suppose $p \in \mathcal{P}(\mathbf{F})$. Prove that $p(T)v = p(\lambda)v$.

11 Suppose $\mathbf{F} = \mathbf{C}$, $T \in \mathcal{L}(V)$, $p \in \mathcal{P}(\mathbf{C})$ is a polynomial, and $\alpha \in \mathbf{C}$.
   Prove that $\alpha$ is an eigenvalue of $p(T)$ if and only if $\alpha = p(\lambda)$ for some
   eigenvalue $\lambda$ of $T$.

12 Show that the result in the previous exercise does not hold if $\mathbf{C}$ is replaced
   with $\mathbf{R}$.
13 Suppose $W$ is a complex vector space and $T \in \mathcal{L}(W)$ has no eigenvalues.
   Prove that every subspace of $W$ invariant under $T$ is either $\{0\}$ or
   infinite-dimensional.

14 Give an example of an operator whose matrix with respect to some basis
   contains only 0's on the diagonal, but the operator is invertible.

   [The exercise above and the exercise below show that 5.30 fails without
   the hypothesis that an upper-triangular matrix is under consideration.]

15 Give an example of an operator whose matrix with respect to some basis
   contains only nonzero numbers on the diagonal, but the operator is not
   invertible.

16 Rewrite the proof of 5.21 using the linear map that sends $p \in \mathcal{P}_n(\mathbf{C})$ to
   $(p(T))v \in V$ (and use 3.23).

17 Rewrite the proof of 5.21 using the linear map that sends $p \in \mathcal{P}_{n^2}(\mathbf{C})$ to
   $p(T) \in \mathcal{L}(V)$ (and use 3.23).

18 Suppose $V$ is a finite-dimensional complex vector space and $T \in \mathcal{L}(V)$.
   Define a function $f \colon \mathbf{C} \to \mathbf{R}$ by
   \[ f(\lambda) = \dim \operatorname{range}(T - \lambda I). \]
   Prove that $f$ is not a continuous function.

19 Suppose $V$ is finite-dimensional with $\dim V > 1$ and $T \in \mathcal{L}(V)$. Prove
   that
   \[ \{p(T) : p \in \mathcal{P}(\mathbf{F})\} \ne \mathcal{L}(V). \]

20 Suppose $V$ is a finite-dimensional complex vector space and $T \in \mathcal{L}(V)$.
   Prove that $T$ has an invariant subspace of dimension $k$ for each
   $k = 1, \dots, \dim V$.
5.C Eigenspaces and Diagonal Matrices
5.34 Definition diagonal matrix 


A diagonal matrix is a square matrix that is 0 everywhere except possibly 
along the diagonal. 


5.35 Example
\[ \begin{pmatrix} 8 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 5 \end{pmatrix} \]
is a diagonal matrix.


Obviously every diagonal matrix is upper triangular. In general, a diagonal 
matrix has many more 0’s than an upper-triangular matrix. 

If an operator has a diagonal matrix with respect to some basis, then the 
entries along the diagonal are precisely the eigenvalues of the operator; this 
follows from 5.32 (or find an easier proof for diagonal matrices). 


5.36 Definition eigenspace, $E(\lambda, T)$

Suppose $T \in \mathcal{L}(V)$ and $\lambda \in \mathbf{F}$. The eigenspace of $T$ corresponding to $\lambda$,
denoted $E(\lambda, T)$, is defined by
\[ E(\lambda, T) = \operatorname{null}(T - \lambda I). \]
In other words, $E(\lambda, T)$ is the set of all eigenvectors of $T$ corresponding
to $\lambda$, along with the 0 vector.


For $T \in \mathcal{L}(V)$ and $\lambda \in \mathbf{F}$, the eigenspace $E(\lambda, T)$ is a subspace of $V$
(because the null space of each linear map on $V$ is a subspace of $V$). The
definitions imply that $\lambda$ is an eigenvalue of $T$ if and only if $E(\lambda, T) \ne \{0\}$.


5.37 Example Suppose the matrix of an operator $T \in \mathcal{L}(V)$ with respect
to a basis $v_1, v_2, v_3$ of $V$ is the matrix in Example 5.35 above. Then
\[ E(8, T) = \operatorname{span}(v_1), \qquad E(5, T) = \operatorname{span}(v_2, v_3). \]


If $\lambda$ is an eigenvalue of an operator $T \in \mathcal{L}(V)$, then $T$ restricted to
$E(\lambda, T)$ is just the operator of multiplication by $\lambda$.
5.38 Sum of eigenspaces is a direct sum

Suppose $V$ is finite-dimensional and $T \in \mathcal{L}(V)$. Suppose also that
$\lambda_1, \dots, \lambda_m$ are distinct eigenvalues of $T$. Then
\[ E(\lambda_1, T) + \cdots + E(\lambda_m, T) \]
is a direct sum. Furthermore,
\[ \dim E(\lambda_1, T) + \cdots + \dim E(\lambda_m, T) \le \dim V. \]


Proof To show that $E(\lambda_1, T) + \cdots + E(\lambda_m, T)$ is a direct sum, suppose
\[ u_1 + \cdots + u_m = 0, \]
where each $u_j$ is in $E(\lambda_j, T)$. Because eigenvectors corresponding to distinct
eigenvalues are linearly independent (see 5.10), this implies that each $u_j$
equals 0. This implies (using 1.44) that $E(\lambda_1, T) + \cdots + E(\lambda_m, T)$ is a direct
sum, as desired.

Now
\[ \dim E(\lambda_1, T) + \cdots + \dim E(\lambda_m, T) = \dim\big(E(\lambda_1, T) \oplus \cdots \oplus E(\lambda_m, T)\big) \le \dim V, \]
where the equality above follows from Exercise 16 in Section 2.C. ■


5.39 Definition diagonalizable


An operator $T \in \mathcal{L}(V)$ is called diagonalizable if the operator has a
diagonal matrix with respect to some basis of $V$.


5.40 Example Define $T \in \mathcal{L}(\mathbf{R}^2)$ by
\[ T(x, y) = (41x + 7y, -20x + 74y). \]
The matrix of $T$ with respect to the standard basis of $\mathbf{R}^2$ is
\[ \begin{pmatrix} 41 & 7 \\ -20 & 74 \end{pmatrix}, \]
which is not a diagonal matrix. However, $T$ is diagonalizable, because the
matrix of $T$ with respect to the basis $(1, 4), (7, 5)$ is
\[ \begin{pmatrix} 69 & 0 \\ 0 & 46 \end{pmatrix}, \]
as you should verify.
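
One way to carry out the suggested verification is to compute the matrix of T with respect to the new basis as P⁻¹AP, where the columns of P are the basis vectors (1, 4) and (7, 5). The sketch below assumes NumPy; the variable names are chosen only for this illustration.

```python
import numpy as np

# Matrix of T(x, y) = (41x + 7y, -20x + 74y) in the standard basis.
A = np.array([[ 41.0,  7.0],
              [-20.0, 74.0]])

# Columns are the basis vectors (1, 4) and (7, 5).
P = np.array([[1.0, 7.0],
              [4.0, 5.0]])

# Matrix of T with respect to the basis (1, 4), (7, 5): P^{-1} A P,
# computed by solving P X = A P rather than forming the inverse.
D = np.linalg.solve(P, A @ P)
print(np.round(D, 10))
```

By hand, T(1, 4) = (69, 276) = 69(1, 4) and T(7, 5) = (322, 230) = 46(7, 5), which agrees with the diagonal entries 69 and 46.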
5.41 Conditions equivalent to diagonalizability

Suppose $V$ is finite-dimensional and $T \in \mathcal{L}(V)$. Let $\lambda_1, \dots, \lambda_m$ denote
the distinct eigenvalues of $T$. Then the following are equivalent:

(a) $T$ is diagonalizable;
(b) $V$ has a basis consisting of eigenvectors of $T$;
(c) there exist 1-dimensional subspaces $U_1, \dots, U_n$ of $V$, each invariant
    under $T$, such that $V = U_1 \oplus \cdots \oplus U_n$;
(d) $V = E(\lambda_1, T) \oplus \cdots \oplus E(\lambda_m, T)$;
(e) $\dim V = \dim E(\lambda_1, T) + \cdots + \dim E(\lambda_m, T)$.


Proof An operator $T \in \mathcal{L}(V)$ has a diagonal matrix
\[ \begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix} \]
with respect to a basis $v_1, \dots, v_n$ of $V$ if and only if $Tv_j = \lambda_j v_j$ for each $j$.
Thus (a) and (b) are equivalent.

Suppose (b) holds; thus $V$ has a basis $v_1, \dots, v_n$ consisting of eigenvectors
of $T$. For each $j$, let $U_j = \operatorname{span}(v_j)$. Obviously each $U_j$ is a 1-dimensional
subspace of $V$ that is invariant under $T$. Because $v_1, \dots, v_n$ is a basis of $V$,
each vector in $V$ can be written uniquely as a linear combination of $v_1, \dots, v_n$.
In other words, each vector in $V$ can be written uniquely as a sum $u_1 + \cdots + u_n$,
where each $u_j$ is in $U_j$. Thus $V = U_1 \oplus \cdots \oplus U_n$. Hence (b) implies (c).

Suppose now that (c) holds; thus there are 1-dimensional subspaces
$U_1, \dots, U_n$ of $V$, each invariant under $T$, such that $V = U_1 \oplus \cdots \oplus U_n$.
For each $j$, let $v_j$ be a nonzero vector in $U_j$. Then each $v_j$ is an eigenvector
of $T$. Because each vector in $V$ can be written uniquely as a sum $u_1 + \cdots + u_n$,
where each $u_j$ is in $U_j$ (so each $u_j$ is a scalar multiple of $v_j$), we see that
$v_1, \dots, v_n$ is a basis of $V$. Thus (c) implies (b).

At this stage of the proof we know that (a), (b), and (c) are all equivalent.
We will finish the proof by showing that (b) implies (d), that (d) implies (e),
and that (e) implies (b).

Suppose (b) holds; thus $V$ has a basis consisting of eigenvectors of $T$.
Hence every vector in $V$ is a linear combination of eigenvectors of $T$, which
implies that
\[ V = E(\lambda_1, T) + \cdots + E(\lambda_m, T). \]
Now 5.38 shows that (d) holds.

That (d) implies (e) follows immediately from Exercise 16 in Section 2.C.

Finally, suppose (e) holds; thus

5.42   $\dim V = \dim E(\lambda_1, T) + \cdots + \dim E(\lambda_m, T)$.

Choose a basis of each $E(\lambda_j, T)$; put all these bases together to form a list
$v_1, \dots, v_n$ of eigenvectors of $T$, where $n = \dim V$ (by 5.42). To show that
this list is linearly independent, suppose
\[ a_1 v_1 + \cdots + a_n v_n = 0, \]
where $a_1, \dots, a_n \in \mathbf{F}$. For each $j = 1, \dots, m$, let $u_j$ denote the sum of all
the terms $a_k v_k$ such that $v_k \in E(\lambda_j, T)$. Thus each $u_j$ is in $E(\lambda_j, T)$, and
\[ u_1 + \cdots + u_m = 0. \]
Because eigenvectors corresponding to distinct eigenvalues are linearly
independent (see 5.10), this implies that each $u_j$ equals 0. Because each $u_j$ is a
sum of terms $a_k v_k$, where the $v_k$'s were chosen to be a basis of $E(\lambda_j, T)$, this
implies that all the $a_k$'s equal 0. Thus $v_1, \dots, v_n$ is linearly independent and
hence is a basis of $V$ (by 2.39). Thus (e) implies (b), completing the proof. ■


Unfortunately not every operator is diagonalizable. This sad state of affairs 
can arise even on complex vector spaces, as shown by the next example. 


5.43 Example Show that the operator $T \in \mathcal{L}(\mathbf{C}^2)$ defined by
\[ T(w, z) = (z, 0) \]
is not diagonalizable.


Solution As you should verify, 0 is the only eigenvalue of T and furthermore
\[ E(0, T) = \{(w, 0) \in \mathbf{C}^2 : w \in \mathbf{C}\}. \]

Thus conditions (b), (c), (d), and (e) of 5.41 are easily seen to fail (of
course, because these conditions are equivalent, it is only necessary to check
that one of them fails). Thus condition (a) of 5.41 also fails, and hence T is
not diagonalizable.

The next result shows that if an operator has as many distinct eigenvalues
as the dimension of its domain, then the operator is diagonalizable.


5.44 Enough eigenvalues implies diagonalizability


If $T \in \mathcal{L}(V)$ has dim V distinct eigenvalues, then T is diagonalizable.


Proof Suppose $T \in \mathcal{L}(V)$ has dim V distinct eigenvalues $\lambda_1, \dots, \lambda_{\dim V}$.
For each j, let $v_j \in V$ be an eigenvector corresponding to the eigenvalue $\lambda_j$.
Because eigenvectors corresponding to distinct eigenvalues are linearly independent
(see 5.10), $v_1, \dots, v_{\dim V}$ is linearly independent. A linearly independent
list of dim V vectors in V is a basis of V (see 2.39); thus $v_1, \dots, v_{\dim V}$
is a basis of V. With respect to this basis consisting of eigenvectors, T has a
diagonal matrix. ∎


5.45 Example Define $T \in \mathcal{L}(\mathbf{F}^3)$ by $T(x, y, z) = (2x + y, 5y + 3z, 8z)$.
Find a basis of $\mathbf{F}^3$ with respect to which T has a diagonal matrix.


Solution With respect to the standard basis, the matrix of T is

\[ \begin{pmatrix} 2 & 1 & 0 \\ 0 & 5 & 3 \\ 0 & 0 & 8 \end{pmatrix}. \]

The matrix above is upper triangular. Thus by 5.32, the eigenvalues of T are
2, 5, and 8. Because T is an operator on a vector space with dimension 3 and
T has three distinct eigenvalues, 5.44 assures us that there exists a basis of $\mathbf{F}^3$
with respect to which T has a diagonal matrix.

To find this basis, we only have to find an eigenvector for each eigenvalue.
In other words, we have to find a nonzero solution to the equation

\[ T(x, y, z) = \lambda (x, y, z) \]

for λ = 2, then for λ = 5, and then for λ = 8. These simple equations are
easy to solve: for λ = 2 we have the eigenvector (1, 0, 0); for λ = 5 we have
the eigenvector (1, 3, 0); for λ = 8 we have the eigenvector (1, 6, 6).

Thus (1, 0, 0), (1, 3, 0), (1, 6, 6) is a basis of $\mathbf{F}^3$, and with respect to this
basis the matrix of T is

\[ \begin{pmatrix} 2 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 8 \end{pmatrix}. \]

The converse of 5.44 is not true. For example, the operator T defined on
the three-dimensional space $\mathbf{F}^3$ by

\[ T(z_1, z_2, z_3) = (4z_1, 4z_2, 5z_3) \]

has only two eigenvalues (4 and 5), but this operator has a diagonal matrix
with respect to the standard basis.
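
The eigenvector claims in Example 5.45 can be checked mechanically. The following is only a sketch in Python (not part of the text's development); the helper names `T`, `scale`, and `eigenpairs` are illustrative choices, and the entries are taken to be integers so that the arithmetic is exact.

```python
def T(v):
    """The operator of Example 5.45: T(x, y, z) = (2x + y, 5y + 3z, 8z)."""
    x, y, z = v
    return (2 * x + y, 5 * y + 3 * z, 8 * z)


def scale(c, v):
    """Scalar multiple c * v of a vector stored as a tuple."""
    return tuple(c * t for t in v)


# Each pair is (eigenvalue, claimed eigenvector) from the solution of 5.45.
eigenpairs = [(2, (1, 0, 0)), (5, (1, 3, 0)), (8, (1, 6, 6))]

# For every pair, test the defining equation T(v) = lambda * v.
checks = [T(v) == scale(lam, v) for lam, v in eigenpairs]
print(checks)
```

Because the three eigenvalues are distinct, 5.10 already guarantees that the three eigenvectors are linearly independent, so no separate independence check is needed.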

In later chapters we will find additional conditions that imply that certain 
operators are diagonalizable. 


EXERCISES 5.C 


1 Suppose $T \in \mathcal{L}(V)$ is diagonalizable. Prove that $V = \operatorname{null} T \oplus \operatorname{range} T$.


2 Prove the converse of the statement in the exercise above or give a
counterexample to the converse.


3 Suppose V is finite-dimensional and $T \in \mathcal{L}(V)$. Prove that the following
are equivalent:
(a) $V = \operatorname{null} T \oplus \operatorname{range} T$.
(b) $V = \operatorname{null} T + \operatorname{range} T$.
(c) $\operatorname{null} T \cap \operatorname{range} T = \{0\}$.


4 Give an example to show that the exercise above is false without the
hypothesis that V is finite-dimensional.


5 Suppose V is a finite-dimensional complex vector space and $T \in \mathcal{L}(V)$.
Prove that T is diagonalizable if and only if

\[ V = \operatorname{null}(T - \lambda I) \oplus \operatorname{range}(T - \lambda I) \]

for every $\lambda \in \mathbf{C}$.


6 Suppose V is finite-dimensional, $T \in \mathcal{L}(V)$ has dim V distinct eigenvalues,
and $S \in \mathcal{L}(V)$ has the same eigenvectors as T (not necessarily with
the same eigenvalues). Prove that ST = TS.


7 Suppose $T \in \mathcal{L}(V)$ has a diagonal matrix A with respect to some basis
of V and that $\lambda \in \mathbf{F}$. Prove that λ appears on the diagonal of A precisely
$\dim E(\lambda, T)$ times.


8 Suppose $T \in \mathcal{L}(\mathbf{F}^5)$ and $\dim E(8, T) = 4$. Prove that $T - 2I$ or $T - 6I$
is invertible.


9 Suppose $T \in \mathcal{L}(V)$ is invertible. Prove that $E(\lambda, T) = E(\tfrac{1}{\lambda}, T^{-1})$ for
every $\lambda \in \mathbf{F}$ with $\lambda \ne 0$.


10 Suppose that V is finite-dimensional and $T \in \mathcal{L}(V)$. Let $\lambda_1, \dots, \lambda_m$
denote the distinct nonzero eigenvalues of T. Prove that

\[ \dim E(\lambda_1, T) + \cdots + \dim E(\lambda_m, T) \le \dim \operatorname{range} T. \]


11 Verify the assertion in Example 5.40.


12 Suppose $R, T \in \mathcal{L}(\mathbf{F}^3)$ each have 2, 6, 7 as eigenvalues. Prove that there
exists an invertible operator $S \in \mathcal{L}(\mathbf{F}^3)$ such that $R = S^{-1} T S$.


13 Find $R, T \in \mathcal{L}(\mathbf{F}^4)$ such that R and T each have 2, 6, 7 as eigenvalues,
R and T have no other eigenvalues, and there does not exist an invertible
operator $S \in \mathcal{L}(\mathbf{F}^4)$ such that $R = S^{-1} T S$.


14 Find $T \in \mathcal{L}(\mathbf{C}^3)$ such that 6 and 7 are eigenvalues of T and such that T
does not have a diagonal matrix with respect to any basis of $\mathbf{C}^3$.


15 Suppose $T \in \mathcal{L}(\mathbf{C}^3)$ is such that 6 and 7 are eigenvalues of T. Furthermore,
suppose T does not have a diagonal matrix with respect
to any basis of $\mathbf{C}^3$. Prove that there exists $(x, y, z) \in \mathbf{C}^3$ such that
$T(x, y, z) = (17 + 8x, \sqrt{5} + 8y, 2\pi + 8z)$.


16 The Fibonacci sequence $F_1, F_2, \dots$ is defined by
\[ F_1 = 1, \quad F_2 = 1, \quad \text{and} \quad F_n = F_{n-2} + F_{n-1} \text{ for } n \ge 3. \]
Define $T \in \mathcal{L}(\mathbf{R}^2)$ by $T(x, y) = (y, x + y)$.

(a) Show that $T^n(0, 1) = (F_n, F_{n+1})$ for each positive integer n.

(b) Find the eigenvalues of T.

(c) Find a basis of $\mathbf{R}^2$ consisting of eigenvectors of T.

(d) Use the solution to part (c) to compute $T^n(0, 1)$. Conclude that

\[ F_n = \frac{1}{\sqrt{5}} \left[ \left( \frac{1 + \sqrt{5}}{2} \right)^n - \left( \frac{1 - \sqrt{5}}{2} \right)^n \right] \]

for each positive integer n.

(e) Use part (d) to conclude that for each positive integer n, the
Fibonacci number $F_n$ is the integer that is closest to

\[ \frac{1}{\sqrt{5}} \left( \frac{1 + \sqrt{5}}{2} \right)^n. \]


Inner Product Spaces


In making the definition of a vector space, we generalized the linear structure
(addition and scalar multiplication) of $\mathbf{R}^2$ and $\mathbf{R}^3$. We ignored other important
features, such as the notions of length and angle. These ideas are embedded
in the concept we now investigate, inner products.

Our standing assumptions are as follows: 


6.1 Notation F, V 


F denotes R or C. 


V denotes a vector space over F. 


LEARNING OBJECTIVES FOR THIS CHAPTER


• Cauchy–Schwarz Inequality

• Gram–Schmidt Procedure

• linear functionals on inner product spaces

• calculating minimum distance to a subspace


6.A Inner Products and Norms


Inner Products


To motivate the concept of inner product, think of vectors in $\mathbf{R}^2$ and $\mathbf{R}^3$ as
arrows with initial point at the origin. The length of a vector x in $\mathbf{R}^2$ or $\mathbf{R}^3$
is called the norm of x, denoted $\|x\|$. Thus for $x = (x_1, x_2) \in \mathbf{R}^2$, we have
$\|x\| = \sqrt{x_1^2 + x_2^2}$. Similarly, if $x = (x_1, x_2, x_3) \in \mathbf{R}^3$,
then $\|x\| = \sqrt{x_1^2 + x_2^2 + x_3^2}$.

[Figure: the vector $x = (x_1, x_2)$ drawn as an arrow from the origin. The length of this vector x is $\sqrt{x_1^2 + x_2^2}$.]

Even though we cannot draw pictures in higher dimensions, the generalization
to $\mathbf{R}^n$ is obvious: we define the norm of $x = (x_1, \dots, x_n) \in \mathbf{R}^n$
by

\[ \|x\| = \sqrt{x_1^2 + \cdots + x_n^2}. \]

The norm is not linear on $\mathbf{R}^n$. To inject linearity into the discussion, we
introduce the dot product.


6.2 Definition dot product
For $x, y \in \mathbf{R}^n$, the dot product of x and y, denoted $x \cdot y$, is defined by

\[ x \cdot y = x_1 y_1 + \cdots + x_n y_n, \]

where $x = (x_1, \dots, x_n)$ and $y = (y_1, \dots, y_n)$.


If we think of vectors as points instead of arrows, then $\|x\|$ should
be interpreted as the distance from the origin to the point x.

Note that the dot product of two vectors in $\mathbf{R}^n$ is a number, not a vector.
Obviously $x \cdot x = \|x\|^2$ for all $x \in \mathbf{R}^n$.
The dot product on $\mathbf{R}^n$ has the following properties:


• $x \cdot x \ge 0$ for all $x \in \mathbf{R}^n$;


• $x \cdot x = 0$ if and only if x = 0;


• for $y \in \mathbf{R}^n$ fixed, the map from $\mathbf{R}^n$ to $\mathbf{R}$ that sends $x \in \mathbf{R}^n$ to $x \cdot y$ is
linear;


• $x \cdot y = y \cdot x$ for all $x, y \in \mathbf{R}^n$.
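
As a small illustration of Definition 6.2 and of the relation between the dot product and the norm, here is a sketch in Python. The function names `dot` and `norm` and the sample vectors are illustrative assumptions, not notation from the text; vectors are stored as tuples of real numbers.

```python
import math


def dot(x, y):
    """Dot product x . y = x_1 y_1 + ... + x_n y_n of two vectors in R^n."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(x, y))


def norm(x):
    """Norm of x, computed as the square root of x . x."""
    return math.sqrt(dot(x, x))


x = (3.0, 4.0)
y = (1.0, 2.0)
print(dot(x, y))   # 3*1 + 4*2 = 11.0
print(norm(x))     # sqrt(9 + 16) = 5.0
```

Symmetry ($x \cdot y = y \cdot x$) and linearity in the first slot can be spot-checked the same way on sample vectors, although such checks are of course not proofs.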
An inner product is a generalization of the dot product. At this point you
may be tempted to guess that an inner product is defined by abstracting the
properties of the dot product discussed in the last paragraph. For real vector
spaces, that guess is correct. However, so that we can make a definition that
will be useful for both real and complex vector spaces, we need to examine
the complex case before making the definition.

Recall that if $\lambda = a + bi$, where $a, b \in \mathbf{R}$, then


• the absolute value of λ, denoted $|\lambda|$, is defined by $|\lambda| = \sqrt{a^2 + b^2}$;
• the complex conjugate of λ, denoted $\bar{\lambda}$, is defined by $\bar{\lambda} = a - bi$;
• $|\lambda|^2 = \lambda \bar{\lambda}$.


See Chapter 4 for the definitions and the basic properties of the absolute value
and complex conjugate.
For $z = (z_1, \dots, z_n) \in \mathbf{C}^n$, we define the norm of z by

\[ \|z\| = \sqrt{|z_1|^2 + \cdots + |z_n|^2}. \]

The absolute values are needed because we want $\|z\|$ to be a nonnegative
number. Note that

\[ \|z\|^2 = z_1 \overline{z_1} + \cdots + z_n \overline{z_n}. \]

We want to think of $\|z\|^2$ as the inner product of z with itself, as we
did in $\mathbf{R}^n$. The equation above thus suggests that the inner product of
$w = (w_1, \dots, w_n) \in \mathbf{C}^n$ with z should equal

\[ w_1 \overline{z_1} + \cdots + w_n \overline{z_n}. \]

If the roles of w and z were interchanged, the expression above would
be replaced with its complex conjugate. In other words, we should expect
that the inner product of w with z equals the complex conjugate of the inner
product of z with w. With that motivation, we are now ready to define an
inner product on V, which may be a real or a complex vector space.

Two comments about the notation used in the next definition:


• If λ is a complex number, then the notation $\lambda \ge 0$ means that λ is real
and nonnegative.


• We use the common notation $\langle u, v \rangle$, with angle brackets denoting an
inner product. Some people use parentheses instead, but then (u, v)
becomes ambiguous because it could denote either an ordered pair or
an inner product.


6.3 Definition inner product
An inner product on V is a function that takes each ordered pair (u, v) of
elements of V to a number $\langle u, v \rangle \in \mathbf{F}$ and has the following properties:


positivity
$\langle v, v \rangle \ge 0$ for all $v \in V$;


definiteness
$\langle v, v \rangle = 0$ if and only if v = 0;


additivity in first slot
$\langle u + v, w \rangle = \langle u, w \rangle + \langle v, w \rangle$ for all $u, v, w \in V$;


homogeneity in first slot
$\langle \lambda u, v \rangle = \lambda \langle u, v \rangle$ for all $\lambda \in \mathbf{F}$ and all $u, v \in V$;


conjugate symmetry
$\langle u, v \rangle = \overline{\langle v, u \rangle}$ for all $u, v \in V$.


Although most mathematicians define an inner product as above,
many physicists use a definition that requires homogeneity in the
second slot instead of the first slot.

Every real number equals its complex conjugate. Thus if we are dealing
with a real vector space, then in the last condition above we can dispense with
the complex conjugate and simply state that $\langle u, v \rangle = \langle v, u \rangle$ for all $u, v \in V$.


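6.4 Example inner products


(a) The Euclidean inner product on $\mathbf{F}^n$ is defined by

\[ \langle (w_1, \dots, w_n), (z_1, \dots, z_n) \rangle = w_1 \overline{z_1} + \cdots + w_n \overline{z_n}. \]

(b) If $c_1, \dots, c_n$ are positive numbers, then an inner product can be defined
on $\mathbf{F}^n$ by

\[ \langle (w_1, \dots, w_n), (z_1, \dots, z_n) \rangle = c_1 w_1 \overline{z_1} + \cdots + c_n w_n \overline{z_n}. \]

(c) An inner product can be defined on the vector space of continuous
real-valued functions on the interval [−1, 1] by

\[ \langle f, g \rangle = \int_{-1}^{1} f(x) g(x) \, dx. \]

(d) An inner product can be defined on $\mathcal{P}(\mathbf{R})$ by

\[ \langle p, q \rangle = \int_{0}^{\infty} p(x) q(x) e^{-x} \, dx. \]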
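
To make part (a) of the example concrete, the following Python sketch computes the Euclidean inner product on $\mathbf{C}^2$ and spot-checks conjugate symmetry and homogeneity in the first slot on one pair of vectors. The function name `inner` and the sample vectors `w` and `z` are illustrative assumptions; small integer entries are used so that the floating-point arithmetic is exact.

```python
def inner(w, z):
    """Euclidean inner product on F^n: the sum of w_j times conj(z_j)."""
    return sum(a * complex(b).conjugate() for a, b in zip(w, z))


w = (1 + 2j, 3 - 1j)
z = (2 - 1j, 1j)

print(inner(w, z))                             # (-1+2j)
print(inner(z, w) == inner(w, z).conjugate())  # conjugate symmetry
print(inner(w, w))                             # |1+2i|^2 + |3-i|^2 = 15, returned as a complex number

# Homogeneity in the first slot: <lam * w, z> = lam * <w, z>.
lam = 2j
scaled = tuple(lam * a for a in w)
print(inner(scaled, z) == lam * inner(w, z))
```

Note that the conjugate sits on the second argument, matching the convention of homogeneity in the first slot used in 6.3.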
6.5 Definition inner product space


An inner product space is a vector space V along with an inner product
on V.


The most important example of an inner product space is $\mathbf{F}^n$ with the
Euclidean inner product given by part (a) of the last example. When $\mathbf{F}^n$ is
referred to as an inner product space, you should assume that the inner product
is the Euclidean inner product unless explicitly told otherwise.

So that we do not have to keep repeating the hypothesis that V is an inner 
product space, for the rest of this chapter we make the following assumption: 


6.6 Notation V 


For the rest of this chapter, V denotes an inner product space over F. 


Note the slight abuse of language here. An inner product space is a vector 
space along with an inner product on that vector space. When we say that 
a vector space V is an inner product space, we are also thinking that an 
inner product on V is lurking nearby or is obvious from the context (or is the
Euclidean inner product if the vector space is $\mathbf{F}^n$).


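6.7 Basic properties of an inner product


(a) For each fixed $u \in V$, the function that takes v to $\langle v, u \rangle$ is a linear
map from V to $\mathbf{F}$.

(b) $\langle 0, u \rangle = 0$ for every $u \in V$.

(c) $\langle u, 0 \rangle = 0$ for every $u \in V$.

(d) $\langle u, v + w \rangle = \langle u, v \rangle + \langle u, w \rangle$ for all $u, v, w \in V$.

(e) $\langle u, \lambda v \rangle = \bar{\lambda} \langle u, v \rangle$ for all $\lambda \in \mathbf{F}$ and $u, v \in V$.

Proof


(a) Part (a) follows from the conditions of additivity in the first slot and
homogeneity in the first slot in the definition of an inner product.


(b) Part (b) follows from part (a) and the result that every linear map takes
0 to 0.

(c) Part (c) follows from part (a) and the conjugate symmetry property in
the definition of an inner product.


(d) Suppose $u, v, w \in V$. Then

\begin{align*}
\langle u, v + w \rangle &= \overline{\langle v + w, u \rangle} \\
&= \overline{\langle v, u \rangle + \langle w, u \rangle} \\
&= \overline{\langle v, u \rangle} + \overline{\langle w, u \rangle} \\
&= \langle u, v \rangle + \langle u, w \rangle.
\end{align*}

(e) Suppose $\lambda \in \mathbf{F}$ and $u, v \in V$. Then

\begin{align*}
\langle u, \lambda v \rangle &= \overline{\langle \lambda v, u \rangle} \\
&= \overline{\lambda \langle v, u \rangle} \\
&= \bar{\lambda} \, \overline{\langle v, u \rangle} \\
&= \bar{\lambda} \langle u, v \rangle,
\end{align*}

as desired. ∎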


Norms 


Our motivation for defining inner products came initially from the norms of
vectors on $\mathbf{R}^2$ and $\mathbf{R}^3$. Now we see that each inner product determines a
norm.


6.8 Definition norm, $\|v\|$


For $v \in V$, the norm of v, denoted $\|v\|$, is defined by

\[ \|v\| = \sqrt{\langle v, v \rangle}. \]


6.9 Example norms
(a) If $(z_1, \dots, z_n) \in \mathbf{F}^n$ (with the Euclidean inner product), then

\[ \|(z_1, \dots, z_n)\| = \sqrt{|z_1|^2 + \cdots + |z_n|^2}. \]

(b) In the vector space of continuous real-valued functions on [−1, 1] [with
inner product given as in part (c) of 6.4], we have

\[ \|f\| = \sqrt{\int_{-1}^{1} \bigl( f(x) \bigr)^2 \, dx}. \]


6.10 Basic properties of the norm


Suppose $v \in V$.

(a) $\|v\| = 0$ if and only if v = 0.

(b) $\|\lambda v\| = |\lambda| \, \|v\|$ for all $\lambda \in \mathbf{F}$.

Proof


(a) The desired result holds because $\langle v, v \rangle = 0$ if and only if v = 0.
(b) Suppose $\lambda \in \mathbf{F}$. Then

\begin{align*}
\|\lambda v\|^2 &= \langle \lambda v, \lambda v \rangle \\
&= \lambda \langle v, \lambda v \rangle \\
&= \lambda \bar{\lambda} \langle v, v \rangle \\
&= |\lambda|^2 \|v\|^2.
\end{align*}

Taking square roots now gives the desired equality. ∎


The proof above of part (b) illustrates a general principle: working with 
norms squared is usually easier than working directly with norms. 
Now we come to a crucial definition. 


6.11 Definition orthogonal


Two vectors $u, v \in V$ are called orthogonal if $\langle u, v \rangle = 0$.


In the definition above, the order of the vectors does not matter, because
$\langle u, v \rangle = 0$ if and only if $\langle v, u \rangle = 0$. Instead of saying that u and v are
orthogonal, sometimes we say that u is orthogonal to v.

Exercise 13 asks you to prove that if u, v are nonzero vectors in $\mathbf{R}^2$, then

\[ \langle u, v \rangle = \|u\| \, \|v\| \cos\theta, \]

where θ is the angle between u and v (thinking of u and v as arrows with initial
point at the origin). Thus two vectors in $\mathbf{R}^2$ are orthogonal (with respect to the
usual Euclidean inner product) if and only if the cosine of the angle between
them is 0, which happens if and only if the vectors are perpendicular in the
usual sense of plane geometry. Thus you can think of the word orthogonal as
a fancy word meaning perpendicular.

We begin our study of orthogonality with an easy result.


6.12 Orthogonality and 0

(a) 0 is orthogonal to every vector in V.

(b) 0 is the only vector in V that is orthogonal to itself.

Proof


(a) Part (b) of 6.7 states that $\langle 0, u \rangle = 0$ for every $u \in V$.


(b) If $v \in V$ and $\langle v, v \rangle = 0$, then v = 0 (by definition of inner product). ∎


For the special case $V = \mathbf{R}^2$, the
next theorem is over 2,500 years old.
Of course, the proof below is not the
original proof.


The word orthogonal comes from
the Greek word orthogonios,
which means right-angled.


6.13 Pythagorean Theorem


Suppose u and v are orthogonal vectors in V. Then

\[ \|u + v\|^2 = \|u\|^2 + \|v\|^2. \]

Proof We have

\begin{align*}
\|u + v\|^2 &= \langle u + v, u + v \rangle \\
&= \langle u, u \rangle + \langle u, v \rangle + \langle v, u \rangle + \langle v, v \rangle \\
&= \|u\|^2 + \|v\|^2,
\end{align*}

as desired. ∎

The proof given above of the Pythagorean Theorem shows that
the conclusion holds if and only if $\langle u, v \rangle + \langle v, u \rangle$, which equals
$2 \operatorname{Re} \langle u, v \rangle$, is 0. Thus the converse of the Pythagorean Theorem holds
in real inner product spaces.

Suppose $u, v \in V$, with $v \ne 0$. We would like to write u as a scalar multiple
of v plus a vector w orthogonal to v, as suggested in the next picture.

[Figure: An orthogonal decomposition.]


To discover how to write u as a scalar multiple of v plus a vector orthogonal
to v, let $c \in \mathbf{F}$ denote a scalar. Then

\[ u = cv + (u - cv). \]

Thus we need to choose c so that v is orthogonal to (u − cv). In other words,
we want

\[ 0 = \langle u - cv, v \rangle = \langle u, v \rangle - c \|v\|^2. \]

The equation above shows that we should choose c to be $\langle u, v \rangle / \|v\|^2$. Making
this choice of c, we can write

\[ u = \frac{\langle u, v \rangle}{\|v\|^2} v + \left( u - \frac{\langle u, v \rangle}{\|v\|^2} v \right). \]

As you should verify, the equation above writes u as a scalar multiple of v
plus a vector orthogonal to v. In other words, we have proved the following
result.


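6.14 An orthogonal decomposition


Suppose $u, v \in V$, with $v \ne 0$. Set $c = \dfrac{\langle u, v \rangle}{\|v\|^2}$ and $w = u - \dfrac{\langle u, v \rangle}{\|v\|^2} v$. Then

\[ \langle w, v \rangle = 0 \quad \text{and} \quad u = cv + w. \]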
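
The decomposition in 6.14 translates directly into a computation. The Python sketch below handles only the real case with integer or rational entries (so that exact arithmetic with `Fraction` applies); the function name `orthogonal_decomposition` and the sample vectors are illustrative assumptions.

```python
from fractions import Fraction


def inner(u, v):
    """Euclidean inner product on R^n (real case only, so no conjugation)."""
    return sum(a * b for a, b in zip(u, v))


def orthogonal_decomposition(u, v):
    """Return (c, w) with u = c*v + w and <w, v> = 0, following 6.14.

    Assumes v is nonzero and that all entries are integers or Fractions.
    """
    vv = inner(v, v)
    if vv == 0:
        raise ValueError("v must be nonzero")
    c = Fraction(inner(u, v), vv)          # c = <u, v> / ||v||^2
    w = tuple(a - c * b for a, b in zip(u, v))
    return c, w


c, w = orthogonal_decomposition((2, 1), (1, 1))
print(c, w)                  # c = 3/2 and w = (1/2, -1/2)
print(inner(w, (1, 1)))      # 0, so w is orthogonal to v
```

In a complex inner product space the same formula applies, but the inner product must conjugate its second argument.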


The orthogonal decomposition 6.14 will be used in the proof of the
Cauchy–Schwarz Inequality, which is our next result and is one of the most
important inequalities in mathematics.

French mathematician Augustin-Louis Cauchy (1789–1857) proved
6.17(a) in 1821. German mathematician Hermann Schwarz (1843–1921)
proved 6.17(b) in 1886.


6.15 Cauchy–Schwarz Inequality


Suppose $u, v \in V$. Then

\[ |\langle u, v \rangle| \le \|u\| \, \|v\|. \]

This inequality is an equality if and only if one of u, v is a scalar multiple
of the other.


Proof If v = 0, then both sides of the desired inequality equal 0. Thus we
can assume that $v \ne 0$. Consider the orthogonal decomposition

\[ u = \frac{\langle u, v \rangle}{\|v\|^2} v + w \]

given by 6.14, where w is orthogonal to v. By the Pythagorean Theorem,

\begin{align*}
\|u\|^2 &= \left\| \frac{\langle u, v \rangle}{\|v\|^2} v \right\|^2 + \|w\|^2 \\
&= \frac{|\langle u, v \rangle|^2}{\|v\|^2} + \|w\|^2 \\
&\ge \frac{|\langle u, v \rangle|^2}{\|v\|^2}. \tag{6.16}
\end{align*}

Multiplying both sides of this inequality by $\|v\|^2$ and then taking square roots
gives the desired inequality.

Looking at the proof in the paragraph above, note that the Cauchy–Schwarz
Inequality is an equality if and only if 6.16 is an equality. Obviously this
happens if and only if w = 0. But w = 0 if and only if u is a multiple of v
(see 6.14). Thus the Cauchy–Schwarz Inequality is an equality if and only if
u is a scalar multiple of v or v is a scalar multiple of u (or both; the phrasing
has been chosen to cover cases in which either u or v equals 0). ∎


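6.17 Example examples of the Cauchy–Schwarz Inequality


(a) If $x_1, \dots, x_n, y_1, \dots, y_n \in \mathbf{R}$, then

\[ |x_1 y_1 + \cdots + x_n y_n|^2 \le (x_1^2 + \cdots + x_n^2)(y_1^2 + \cdots + y_n^2). \]

(b) If f, g are continuous real-valued functions on [−1, 1], then

\[ \left| \int_{-1}^{1} f(x) g(x) \, dx \right|^2 \le \left( \int_{-1}^{1} \bigl( f(x) \bigr)^2 dx \right) \left( \int_{-1}^{1} \bigl( g(x) \bigr)^2 dx \right). \]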
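
Part (a) is easy to test numerically. The following Python sketch computes the difference between the two sides of 6.17(a) for one pair of integer vectors and for a pair in which one vector is a scalar multiple of the other (the equality case of 6.15). The name `cauchy_schwarz_gap` and the sample vectors are illustrative assumptions.

```python
def dot(x, y):
    """Dot product of two vectors in R^n."""
    return sum(a * b for a, b in zip(x, y))


def cauchy_schwarz_gap(x, y):
    """Return (x . x)(y . y) - (x . y)^2, which 6.17(a) says is never negative."""
    return dot(x, x) * dot(y, y) - dot(x, y) ** 2


x = (1, 2, 3)
y = (4, -5, 6)
print(cauchy_schwarz_gap(x, y))   # 14 * 77 - 12**2 = 934

# Equality case: y is a scalar multiple of x, so the gap is 0.
print(cauchy_schwarz_gap(x, tuple(3 * t for t in x)))
```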
The next result, called the Triangle Inequality, has the geometric interpretation
that the length of each side of a triangle is less than the sum of the lengths
of the other two sides.

Note that the Triangle Inequality implies that the shortest path between two
points is a line segment.

[Figure: a triangle illustrating the Triangle Inequality.]
6.18 Triangle Inequality


Suppose $u, v \in V$. Then

\[ \|u + v\| \le \|u\| + \|v\|. \]

This inequality is an equality if and only if one of u, v is a nonnegative
multiple of the other.


Proof We have

\begin{align*}
\|u + v\|^2 &= \langle u + v, u + v \rangle \\
&= \langle u, u \rangle + \langle v, v \rangle + \langle u, v \rangle + \langle v, u \rangle \\
&= \langle u, u \rangle + \langle v, v \rangle + \langle u, v \rangle + \overline{\langle u, v \rangle} \\
&= \|u\|^2 + \|v\|^2 + 2 \operatorname{Re} \langle u, v \rangle \\
&\le \|u\|^2 + \|v\|^2 + 2 |\langle u, v \rangle| \tag{6.19} \\
&\le \|u\|^2 + \|v\|^2 + 2 \|u\| \, \|v\| \tag{6.20} \\
&= (\|u\| + \|v\|)^2,
\end{align*}

where 6.20 follows from the Cauchy–Schwarz Inequality (6.15). Taking
square roots of both sides of the inequality above gives the desired inequality.

The proof above shows that the Triangle Inequality is an equality if and
only if we have equality in 6.19 and 6.20. Thus we have equality in the
Triangle Inequality if and only if

\[ \langle u, v \rangle = \|u\| \, \|v\|. \tag{6.21} \]

If one of u, v is a nonnegative multiple of the other, then 6.21 holds, as
you should verify. Conversely, suppose 6.21 holds. Then the condition for
equality in the Cauchy–Schwarz Inequality (6.15) implies that one of u, v is a
scalar multiple of the other. Clearly 6.21 forces the scalar in question to be
nonnegative, as desired. ∎

The next result is called the parallelogram equality because of its geometric
interpretation: in every parallelogram, the sum of the squares of the lengths
of the diagonals equals the sum of the squares of the lengths of the four sides.

[Figure: The parallelogram equality.]


6.22 Parallelogram Equality


Suppose $u, v \in V$. Then

\[ \|u + v\|^2 + \|u - v\|^2 = 2(\|u\|^2 + \|v\|^2). \]

Proof We have

\begin{align*}
\|u + v\|^2 + \|u - v\|^2 &= \langle u + v, u + v \rangle + \langle u - v, u - v \rangle \\
&= \|u\|^2 + \|v\|^2 + \langle u, v \rangle + \langle v, u \rangle \\
&\quad + \|u\|^2 + \|v\|^2 - \langle u, v \rangle - \langle v, u \rangle \\
&= 2(\|u\|^2 + \|v\|^2),
\end{align*}

as desired. ∎
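
Both the Triangle Inequality (6.18) and the Parallelogram Equality (6.22) can be spot-checked numerically in $\mathbf{C}^2$ with the Euclidean inner product. The Python sketch below is only an illustration on one pair of vectors; the helper names `inner`, `norm`, `add`, and `sub` and the sample vectors are assumptions made for this example.

```python
import math


def inner(u, v):
    """Euclidean inner product on C^n: the sum of u_j times conj(v_j)."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))


def norm(u):
    """Norm induced by the inner product; <u, u> is real and nonnegative."""
    return math.sqrt(inner(u, u).real)


def add(u, v):
    return tuple(a + b for a, b in zip(u, v))


def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


u = (1 + 1j, 2)
v = (3, -1j)

# Parallelogram Equality: both sides equal 32 for these vectors.
lhs = norm(add(u, v)) ** 2 + norm(sub(u, v)) ** 2
rhs = 2 * (norm(u) ** 2 + norm(v) ** 2)
print(math.isclose(lhs, rhs))

# Triangle Inequality: ||u + v|| <= ||u|| + ||v||.
print(norm(add(u, v)) <= norm(u) + norm(v))
```

The comparison uses `math.isclose` because the norms involve floating-point square roots.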


Law professor Richard Friedman presenting a case before the U.S. 
Supreme Court in 2010: 


Mr. Friedman: I think that issue is entirely orthogonal to the issue here 
because the Commonwealth is acknowledging— 

Chief Justice Roberts: I'm sorry. Entirely what?

Mr. Friedman: Orthogonal. Right angle. Unrelated. Irrelevant. 

Chief Justice Roberts: Oh. 

Justice Scalia: What was that adjective? I liked that. 

Mr. Friedman: Orthogonal. 

Chief Justice Roberts: Orthogonal. 

Mr. Friedman: Right, right. 

Justice Scalia: Orthogonal, ooh. (Laughter.)

Justice Kennedy: I knew this case presented us a problem. (Laughter.)

EXERCISES 6.A


1 Show that the function that takes $((x_1, x_2), (y_1, y_2)) \in \mathbf{R}^2 \times \mathbf{R}^2$ to
$|x_1 y_1| + |x_2 y_2|$ is not an inner product on $\mathbf{R}^2$.


2 Show that the function that takes $((x_1, x_2, x_3), (y_1, y_2, y_3)) \in \mathbf{R}^3 \times \mathbf{R}^3$
to $x_1 y_1 + x_3 y_3$ is not an inner product on $\mathbf{R}^3$.


3 Suppose $\mathbf{F} = \mathbf{R}$ and $V \ne \{0\}$. Replace the positivity condition (which
states that $\langle v, v \rangle \ge 0$ for all $v \in V$) in the definition of an inner product
(6.3) with the condition that $\langle v, v \rangle > 0$ for some $v \in V$. Show that this
change in the definition does not change the set of functions from $V \times V$
to $\mathbf{R}$ that are inner products on V.


4 Suppose V is a real inner product space.

(a) Show that $\langle u + v, u - v \rangle = \|u\|^2 - \|v\|^2$ for every $u, v \in V$.

(b) Show that if $u, v \in V$ have the same norm, then u + v is orthogonal
to u − v.

(c) Use part (b) to show that the diagonals of a rhombus are perpendicular
to each other.


5 Suppose $T \in \mathcal{L}(V)$ is such that $\|Tv\| \le \|v\|$ for every $v \in V$. Prove that
$T - \sqrt{2} I$ is invertible.


6 Suppose $u, v \in V$. Prove that $\langle u, v \rangle = 0$ if and only if
\[ \|u\| \le \|u + av\| \]
for all $a \in \mathbf{F}$.


7 Suppose $u, v \in V$. Prove that $\|au + bv\| = \|bu + av\|$ for all $a, b \in \mathbf{R}$
if and only if $\|u\| = \|v\|$.


8 Suppose $u, v \in V$ and $\|u\| = \|v\| = 1$ and $\langle u, v \rangle = 1$. Prove that u = v.


9 Suppose $u, v \in V$ and $\|u\| \le 1$ and $\|v\| \le 1$. Prove that

\[ \sqrt{1 - \|u\|^2} \, \sqrt{1 - \|v\|^2} \le 1 - |\langle u, v \rangle|. \]

10 Find vectors $u, v \in \mathbf{R}^2$ such that u is a scalar multiple of (1, 3), v is
orthogonal to (1, 3), and (1, 2) = u + v.


11 Prove that

\[ 16 \le (a + b + c + d) \left( \frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d} \right) \]

for all positive numbers a, b, c, d.


12 Prove that

\[ (x_1 + \cdots + x_n)^2 \le n (x_1^2 + \cdots + x_n^2) \]

for all positive integers n and all real numbers $x_1, \dots, x_n$.


13 Suppose u, v are nonzero vectors in $\mathbf{R}^2$. Prove that

\[ \langle u, v \rangle = \|u\| \, \|v\| \cos\theta, \]

where θ is the angle between u and v (thinking of u and v as arrows with
initial point at the origin).

Hint: Draw the triangle formed by u, v, and u − v; then use the law of
cosines.


14 The angle between two vectors (thought of as arrows with initial point at
the origin) in $\mathbf{R}^2$ or $\mathbf{R}^3$ can be defined geometrically. However, geometry
is not as clear in $\mathbf{R}^n$ for n > 3. Thus the angle between two nonzero
vectors $x, y \in \mathbf{R}^n$ is defined to be

\[ \arccos \frac{\langle x, y \rangle}{\|x\| \, \|y\|}, \]

where the motivation for this definition comes from the previous exercise.
Explain why the Cauchy–Schwarz Inequality is needed to show that this
definition makes sense.


15 Prove that

\[ \left( \sum_{j=1}^{n} a_j b_j \right)^2 \le \left( \sum_{j=1}^{n} j \, a_j^2 \right) \left( \sum_{j=1}^{n} \frac{b_j^2}{j} \right) \]

for all real numbers $a_1, \dots, a_n$ and $b_1, \dots, b_n$.
16 Suppose $u, v \in V$ are such that

\[ \|u\| = 3, \quad \|u + v\| = 4, \quad \|u - v\| = 6. \]

What number does $\|v\|$ equal?


17 Prove or disprove: there is an inner product on $\mathbf{R}^2$ such that the associated
norm is given by

\[ \|(x, y)\| = \max\{x, y\} \]

for all $(x, y) \in \mathbf{R}^2$.


18 Suppose p > 0. Prove that there is an inner product on $\mathbf{R}^2$ such that the
associated norm is given by

\[ \|(x, y)\| = (x^p + y^p)^{1/p} \]

for all $(x, y) \in \mathbf{R}^2$ if and only if p = 2.


19 Suppose V is a real inner product space. Prove that

\[ \langle u, v \rangle = \frac{\|u + v\|^2 - \|u - v\|^2}{4} \]

for all $u, v \in V$.


20 Suppose V is a complex inner product space. Prove that

\[ \langle u, v \rangle = \frac{\|u + v\|^2 - \|u - v\|^2 + \|u + iv\|^2 i - \|u - iv\|^2 i}{4} \]

for all $u, v \in V$.


21 A norm on a vector space U is a function $\| \ \| \colon U \to [0, \infty)$ such
that $\|u\| = 0$ if and only if u = 0, $\|\alpha u\| = |\alpha| \, \|u\|$ for all $\alpha \in \mathbf{F}$
and all $u \in U$, and $\|u + v\| \le \|u\| + \|v\|$ for all $u, v \in U$. Prove
that a norm satisfying the parallelogram equality comes from an inner
product (in other words, show that if $\| \ \|$ is a norm on U satisfying the
parallelogram equality, then there is an inner product $\langle \ , \ \rangle$ on U such
that $\|u\| = \langle u, u \rangle^{1/2}$ for all $u \in U$).


22 Show that the square of an average is less than or equal to the average
of the squares. More precisely, show that if $a_1, \dots, a_n \in \mathbf{R}$, then the
square of the average of $a_1, \dots, a_n$ is less than or equal to the average
of $a_1^2, \dots, a_n^2$.


23 Suppose $V_1, \dots, V_m$ are inner product spaces. Show that the equation

\[ \langle (u_1, \dots, u_m), (v_1, \dots, v_m) \rangle = \langle u_1, v_1 \rangle + \cdots + \langle u_m, v_m \rangle \]

defines an inner product on $V_1 \times \cdots \times V_m$.

[In the expression above on the right, $\langle u_1, v_1 \rangle$ denotes the inner product
on $V_1$, ..., $\langle u_m, v_m \rangle$ denotes the inner product on $V_m$. Each of the spaces
$V_1, \dots, V_m$ may have a different inner product, even though the same
notation is used here.]


24 Suppose $S \in \mathcal{L}(V)$ is an injective operator on V. Define $\langle \cdot, \cdot \rangle_1$ by

\[ \langle u, v \rangle_1 = \langle Su, Sv \rangle \]

for $u, v \in V$. Show that $\langle \cdot, \cdot \rangle_1$ is an inner product on V.


25 Suppose $S \in \mathcal{L}(V)$ is not injective. Define $\langle \cdot, \cdot \rangle_1$ as in the exercise above.
Explain why $\langle \cdot, \cdot \rangle_1$ is not an inner product on V.


26 Suppose f, g are differentiable functions from $\mathbf{R}$ to $\mathbf{R}^n$.

(a) Show that
\[ \langle f(t), g(t) \rangle' = \langle f'(t), g(t) \rangle + \langle f(t), g'(t) \rangle. \]

(b) Suppose c > 0 and $\|f(t)\| = c$ for every $t \in \mathbf{R}$. Show that
$\langle f'(t), f(t) \rangle = 0$ for every $t \in \mathbf{R}$.

(c) Interpret the result in part (b) geometrically in terms of the tangent
vector to a curve lying on a sphere in $\mathbf{R}^n$ centered at the origin.

[For the exercise above, a function $f \colon \mathbf{R} \to \mathbf{R}^n$ is called differentiable
if there exist differentiable functions $f_1, \dots, f_n$ from $\mathbf{R}$ to $\mathbf{R}$ such that
$f(t) = (f_1(t), \dots, f_n(t))$ for each $t \in \mathbf{R}$. Furthermore, for each $t \in \mathbf{R}$,
the derivative $f'(t) \in \mathbf{R}^n$ is defined by $f'(t) = (f_1'(t), \dots, f_n'(t))$.]


27 Suppose $u, v, w \in V$. Prove that

\[ \left\| w - \tfrac{1}{2}(u + v) \right\|^2 = \frac{\|w - u\|^2 + \|w - v\|^2}{2} - \frac{\|u - v\|^2}{4}. \]


28 Suppose C is a subset of V with the property that $u, v \in C$ implies
$\tfrac{1}{2}(u + v) \in C$. Let $w \in V$. Show that there is at most one point in C
that is closest to w. In other words, show that there is at most one $u \in C$
such that

\[ \|w - u\| \le \|w - v\| \quad \text{for all } v \in C. \]

Hint: Use the previous exercise.
29 For $u, v \in V$, define $d(u, v) = \|u - v\|$.

(a) Show that d is a metric on V.

(b) Show that if V is finite-dimensional, then d is a complete metric
on V (meaning that every Cauchy sequence converges).

(c) Show that every finite-dimensional subspace of V is a closed
subset of V (with respect to the metric d).


30 Fix a positive integer n. The Laplacian Δp of a twice differentiable
function p on $\mathbf{R}^n$ is the function on $\mathbf{R}^n$ defined by

\[ \Delta p = \frac{\partial^2 p}{\partial x_1^2} + \cdots + \frac{\partial^2 p}{\partial x_n^2}. \]

The function p is called harmonic if Δp = 0.

A polynomial on $\mathbf{R}^n$ is a linear combination of functions of the
form $x_1^{m_1} \cdots x_n^{m_n}$, where $m_1, \dots, m_n$ are nonnegative integers.

Suppose q is a polynomial on $\mathbf{R}^n$. Prove that there exists a harmonic
polynomial p on $\mathbf{R}^n$ such that p(x) = q(x) for every $x \in \mathbf{R}^n$ with
$\|x\| = 1$.

[The only fact about harmonic functions that you need for this exercise
is that if p is a harmonic function on $\mathbf{R}^n$ and p(x) = 0 for all $x \in \mathbf{R}^n$
with $\|x\| = 1$, then p = 0.]

Hint: A reasonable guess is that the desired harmonic polynomial p is of
the form $q + (1 - \|x\|^2) r$ for some polynomial r. Prove that there is a
polynomial r on $\mathbf{R}^n$ such that $q + (1 - \|x\|^2) r$ is harmonic by defining
an operator T on a suitable vector space by

\[ Tr = \Delta \bigl( (1 - \|x\|^2) r \bigr) \]

and then showing that T is injective and hence surjective.


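31 Use inner products to prove Apollonius's Identity: In a triangle with
sides of length a, b, and c, let d be the length of the line segment from
the midpoint of the side of length c to the opposite vertex. Then

\[ a^2 + b^2 = \tfrac{1}{2} c^2 + 2 d^2. \]


6.23 Definition orthonormal


• A list of vectors is called orthonormal if each vector in the list has
norm 1 and is orthogonal to all the other vectors in the list.


• In other words, a list $e_1, \dots, e_m$ of vectors in V is orthonormal if

\[ \langle e_j, e_k \rangle = \begin{cases} 1 & \text{if } j = k, \\ 0 & \text{if } j \ne k. \end{cases} \]


6.24 Example orthonormal lists


(a) The standard basis in $\mathbf{F}^n$ is an orthonormal list.


(b) $\left( \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}} \right), \left( -\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}, 0 \right)$ is an orthonormal list in $\mathbf{F}^3$.


(c) $\left( \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}} \right), \left( -\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}, 0 \right), \left( \frac{1}{\sqrt{6}}, \frac{1}{\sqrt{6}}, -\frac{2}{\sqrt{6}} \right)$ is an orthonormal list
in $\mathbf{F}^3$.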

Orthonormal lists are particularly easy to work with, as illustrated by the 
next result. 


6.25 The norm of an orthonormal linear combination


If $e_1, \dots, e_m$ is an orthonormal list of vectors in V, then

\[ \|a_1 e_1 + \cdots + a_m e_m\|^2 = |a_1|^2 + \cdots + |a_m|^2 \]

for all $a_1, \dots, a_m \in \mathbf{F}$.


Proof Because each $e_j$ has norm 1, this follows easily from repeated applications
of the Pythagorean Theorem (6.13). ∎


The result above has the following important corollary.


6.26 An orthonormal list is linearly independent


Every orthonormal list of vectors is linearly independent.

Proof Suppose $e_1, \dots, e_m$ is an orthonormal list of vectors in V and
$a_1, \dots, a_m \in \mathbf{F}$ are such that

\[ a_1 e_1 + \cdots + a_m e_m = 0. \]

Then $|a_1|^2 + \cdots + |a_m|^2 = 0$ (by 6.25), which means that all the $a_j$'s are 0.
Thus $e_1, \dots, e_m$ is linearly independent. ∎


6.27 Definition orthonormal basis 


An orthonormal basis of V is an orthonormal list of vectors in V that is 
also a basis of V. 


For example, the standard basis is an orthonormal basis of $\mathbf{F}^n$.


6.28 An orthonormal list of the right length is an orthonormal basis 


Every orthonormal list of vectors in V with length dim V is an orthonormal 
basis of V. 


Proof By 6.26, any such list must be linearly independent; because it has the
right length, it is a basis (see 2.39). ∎


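6.29 Example Show that

\[ \left( \tfrac12, \tfrac12, \tfrac12, \tfrac12 \right), \ \left( \tfrac12, \tfrac12, -\tfrac12, -\tfrac12 \right), \ \left( \tfrac12, -\tfrac12, -\tfrac12, \tfrac12 \right), \ \left( -\tfrac12, \tfrac12, -\tfrac12, \tfrac12 \right) \]

is an orthonormal basis of $\mathbf{F}^4$.


Solution We have

\[ \left\| \left( \tfrac12, \tfrac12, \tfrac12, \tfrac12 \right) \right\| = \sqrt{ \left( \tfrac12 \right)^2 + \left( \tfrac12 \right)^2 + \left( \tfrac12 \right)^2 + \left( \tfrac12 \right)^2 } = 1. \]

Similarly, the other three vectors in the list above also have norm 1.
We have

\[ \left\langle \left( \tfrac12, \tfrac12, \tfrac12, \tfrac12 \right), \left( \tfrac12, \tfrac12, -\tfrac12, -\tfrac12 \right) \right\rangle = \tfrac12 \cdot \tfrac12 + \tfrac12 \cdot \tfrac12 + \tfrac12 \cdot \left( -\tfrac12 \right) + \tfrac12 \cdot \left( -\tfrac12 \right) = 0. \]

Similarly, the inner product of any two distinct vectors in the list above also
equals 0.

Thus the list above is orthonormal. Because we have an orthonormal list of
length four in the four-dimensional vector space $\mathbf{F}^4$, this list is an orthonormal
basis of $\mathbf{F}^4$ (by 6.28).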
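
The "similarly" steps in the solution above amount to four norm checks and six inner-product checks, which can be carried out exactly with rational arithmetic. The following Python sketch does this for the real case; the names `vectors` and `is_orthonormal` are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations

h = Fraction(1, 2)

# The four vectors of Example 6.29.
vectors = [
    (h, h, h, h),
    (h, h, -h, -h),
    (h, -h, -h, h),
    (-h, h, -h, h),
]


def inner(u, v):
    """Euclidean inner product for vectors with real (here rational) entries."""
    return sum(a * b for a, b in zip(u, v))


def is_orthonormal(vs):
    """True if every vector has norm 1 and distinct vectors are orthogonal."""
    unit = all(inner(v, v) == 1 for v in vs)   # ||v||^2 = 1 iff ||v|| = 1
    orthogonal = all(inner(u, v) == 0 for u, v in combinations(vs, 2))
    return unit and orthogonal


print(is_orthonormal(vectors))
```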
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In general, given a basis e1,..., en of V and a vector v € V, we know that 
there is some choice of scalars $a_1, \dots, a_n \in \mathbf{F}$ such that
\[ v = a_1 e_1 + \cdots + a_n e_n. \]
Computing the numbers $a_1, \dots, a_n$ that satisfy the equation above can be
difficult for an arbitrary basis of $V$. The next result shows, however, that this is
easy for an orthonormal basis: just take $a_j = \langle v, e_j \rangle$.

The importance of orthonormal bases stems mainly from the next result.


6.30 Writing a vector as linear combination of orthonormal basis 


Suppose $e_1, \dots, e_n$ is an orthonormal basis of $V$ and $v \in V$. Then
\[ v = \langle v, e_1 \rangle e_1 + \cdots + \langle v, e_n \rangle e_n \]
and
\[ \|v\|^2 = |\langle v, e_1 \rangle|^2 + \cdots + |\langle v, e_n \rangle|^2. \]

Proof Because $e_1, \dots, e_n$ is a basis of $V$, there exist scalars $a_1, \dots, a_n$ such
that
\[ v = a_1 e_1 + \cdots + a_n e_n. \]
Because $e_1, \dots, e_n$ is orthonormal, taking the inner product of both sides of
this equation with $e_j$ gives $\langle v, e_j \rangle = a_j$. Thus the first equation in 6.30 holds.
The second equation in 6.30 follows immediately from the first equation
and 6.25. ∎
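As a hedged illustration of 6.30, the following Python sketch computes the coefficients $a_j = \langle v, e_j \rangle$ for a sample vector, using an orthonormal basis of $\mathbf{R}^4$ whose entries are $\pm\tfrac12$ (as in Example 6.29). The helper names `inner`, `coefficients`, and `combine`, and the sample vector, are assumptions made for this illustration; exact rational arithmetic avoids rounding.

```python
from fractions import Fraction

# An orthonormal basis of R^4 with entries +-1/2 (as in Example 6.29).
h = Fraction(1, 2)
basis = [
    (h, h, h, h),
    (h, h, -h, -h),
    (h, -h, -h, h),
    (-h, h, -h, h),
]

def inner(u, w):
    """Euclidean inner product on R^4 (real case, so no conjugation)."""
    return sum(a * b for a, b in zip(u, w))

def coefficients(v, basis):
    """Return the scalars a_j = <v, e_j> from 6.30."""
    return [inner(v, e) for e in basis]

def combine(coeffs, basis):
    """Return the linear combination a_1 e_1 + ... + a_n e_n."""
    n = len(basis[0])
    return tuple(sum(a * e[i] for a, e in zip(coeffs, basis)) for i in range(n))

v = (Fraction(1), Fraction(2), Fraction(3), Fraction(4))  # sample vector
a = coefficients(v, basis)
reconstructed = combine(a, basis)          # first equation of 6.30
norm_squared = inner(v, v)                 # left side of the second equation
sum_of_squares = sum(c * c for c in a)     # right side of the second equation
```

No linear system has to be solved here: each coefficient is a single inner product, which is the practical content of 6.30.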


Now that we understand the usefulness of orthonormal bases, how do we
go about finding them? For example, does $\mathcal{P}_m(\mathbf{R})$, with inner product given
by integration on $[-1, 1]$ [see 6.4(c)], have an orthonormal basis? The next
result will lead to answers to these questions.

The algorithm used in the next proof 
is called the Gram-Schmidt Procedure. 
It gives a method for turning a linearly 
independent list into an orthonormal list 
with the same span as the original list. 


Danish mathematician Jørgen 
Gram (1850-1916) and German 
mathematician Erhard Schmidt 
(1876-1959) popularized this algo- 
rithm that constructs orthonormal 
lists. 
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6.31 Gram—Schmidt Procedure 


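Suppose $v_1, \dots, v_m$ is a linearly independent list of vectors in $V$. Let
$e_1 = v_1 / \|v_1\|$. For $j = 2, \dots, m$, define $e_j$ inductively by
\[ e_j = \frac{v_j - \langle v_j, e_1 \rangle e_1 - \cdots - \langle v_j, e_{j-1} \rangle e_{j-1}}{\| v_j - \langle v_j, e_1 \rangle e_1 - \cdots - \langle v_j, e_{j-1} \rangle e_{j-1} \|}. \]
Then $e_1, \dots, e_m$ is an orthonormal list of vectors in $V$ such that
\[ \operatorname{span}(v_1, \dots, v_j) = \operatorname{span}(e_1, \dots, e_j) \]
for $j = 1, \dots, m$.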


Proof We will show by induction on $j$ that the desired conclusion holds. To
get started with $j = 1$, note that $\operatorname{span}(v_1) = \operatorname{span}(e_1)$ because $v_1$ is a positive
multiple of $e_1$.

Suppose $1 < j \le m$ and we have verified that

6.32    \[ \operatorname{span}(v_1, \dots, v_{j-1}) = \operatorname{span}(e_1, \dots, e_{j-1}). \]

Note that $v_j \notin \operatorname{span}(v_1, \dots, v_{j-1})$ (because $v_1, \dots, v_m$ is linearly
independent). Thus $v_j \notin \operatorname{span}(e_1, \dots, e_{j-1})$. Hence we are not dividing by 0 in the
definition of $e_j$ given in 6.31. Dividing a vector by its norm produces a new
vector with norm 1; thus $\|e_j\| = 1$.

Let $1 \le k < j$. Then
\[ \langle e_j, e_k \rangle = \left\langle \frac{v_j - \langle v_j, e_1 \rangle e_1 - \cdots - \langle v_j, e_{j-1} \rangle e_{j-1}}{\| v_j - \langle v_j, e_1 \rangle e_1 - \cdots - \langle v_j, e_{j-1} \rangle e_{j-1} \|}, e_k \right\rangle = \frac{\langle v_j, e_k \rangle - \langle v_j, e_k \rangle}{\| v_j - \langle v_j, e_1 \rangle e_1 - \cdots - \langle v_j, e_{j-1} \rangle e_{j-1} \|} = 0. \]
Thus $e_1, \dots, e_j$ is an orthonormal list.

From the definition of $e_j$ given in 6.31, we see that $v_j \in \operatorname{span}(e_1, \dots, e_j)$.
Combining this information with 6.32 shows that
\[ \operatorname{span}(v_1, \dots, v_j) \subset \operatorname{span}(e_1, \dots, e_j). \]
Both lists above are linearly independent (the $v$'s by hypothesis, the $e$'s by
orthonormality and 6.26). Thus both subspaces above have dimension $j$, and
hence they are equal, completing the proof. ∎
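The inductive formula of 6.31 translates almost line by line into code. The sketch below is a minimal illustration for vectors in $\mathbf{C}^n$ with the standard inner product; the function names and the tolerance `tol` are assumptions of this sketch, and the tolerance check is only an added guard for floating-point input (the proof above shows that the denominator is nonzero for a linearly independent list).

```python
import math

def inner(u, w):
    """Standard inner product on C^n: sum of u_i * conj(w_i)."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, w))

def norm(u):
    return math.sqrt(inner(u, u).real)

def gram_schmidt(vectors, tol=1e-12):
    """Apply the formula of 6.31 to a linearly independent list."""
    es = []
    for v in vectors:
        # Numerator of 6.31: subtract the components of v along e_1, ..., e_{j-1}.
        w = list(v)
        for e in es:
            c = inner(v, e)
            w = [wi - c * ei for wi, ei in zip(w, e)]
        length = norm(w)
        if length < tol:
            raise ValueError("numerator is (numerically) zero")
        # Dividing by the norm produces a vector of norm 1.
        es.append([wi / length for wi in w])
    return es

# Sample input: a basis of R^3, viewed inside C^3.
vs = [(1, 0, 0), (1, 1, 1), (1, 1, 2)]
es = gram_schmidt(vs)
```

For the sample input the output is $(1,0,0)$, $(0, 1/\sqrt2, 1/\sqrt2)$, $(0, -1/\sqrt2, 1/\sqrt2)$, up to rounding.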
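6.33 Example Find an orthonormal basis of $\mathcal{P}_2(\mathbf{R})$, where the inner product
is given by $\langle p, q \rangle = \int_{-1}^{1} p(x) q(x)\,dx$.

Solution We will apply the Gram-Schmidt Procedure (6.31) to the basis
$1, x, x^2$.

To get started, with this inner product we have
\[ \|1\|^2 = \int_{-1}^{1} 1^2\,dx = 2. \]
Thus $\|1\| = \sqrt{2}$, and hence $e_1 = \sqrt{\tfrac12}$.

Now the numerator in the expression for $e_2$ is
\[ x - \langle x, e_1 \rangle e_1 = x - \left( \int_{-1}^{1} x \sqrt{\tfrac12}\,dx \right) \sqrt{\tfrac12} = x. \]
We have
\[ \|x\|^2 = \int_{-1}^{1} x^2\,dx = \tfrac23. \]
Thus $\|x\| = \sqrt{\tfrac23}$, and hence $e_2 = \sqrt{\tfrac32}\,x$.

Now the numerator in the expression for $e_3$ is
\[ x^2 - \langle x^2, e_1 \rangle e_1 - \langle x^2, e_2 \rangle e_2 = x^2 - \left( \int_{-1}^{1} x^2 \sqrt{\tfrac12}\,dx \right) \sqrt{\tfrac12} - \left( \int_{-1}^{1} x^2 \sqrt{\tfrac32}\,x\,dx \right) \sqrt{\tfrac32}\,x = x^2 - \tfrac13. \]
We have
\[ \left\| x^2 - \tfrac13 \right\|^2 = \int_{-1}^{1} \left( x^4 - \tfrac23 x^2 + \tfrac19 \right) dx = \tfrac{8}{45}. \]
Thus $\left\| x^2 - \tfrac13 \right\| = \sqrt{\tfrac{8}{45}}$, and hence $e_3 = \sqrt{\tfrac{45}{8}} \left( x^2 - \tfrac13 \right)$.

Thus
\[ \sqrt{\tfrac12},\quad \sqrt{\tfrac32}\,x,\quad \sqrt{\tfrac{45}{8}} \left( x^2 - \tfrac13 \right) \]
is an orthonormal list of length 3 in $\mathcal{P}_2(\mathbf{R})$. Hence this orthonormal list is an
orthonormal basis of $\mathcal{P}_2(\mathbf{R})$ by 6.28.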
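The integrals in Example 6.33 can be checked with exact rational arithmetic. The sketch below is an illustration under stated assumptions: polynomials are stored as coefficient lists, the names `inner` and `orthogonalize` are hypothetical, and the normalization step is deliberately skipped so that no square roots appear. Dividing each output by its norm recovers $e_1, e_2, e_3$.

```python
from fractions import Fraction

def inner(p, q):
    """<p, q> = integral over [-1, 1] of p(x) q(x) dx, for coefficient lists."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            k = i + j
            if k % 2 == 0:  # odd powers of x integrate to 0 on [-1, 1]
                total += Fraction(a) * Fraction(b) * Fraction(2, k + 1)
    return total

def orthogonalize(polys):
    """Gram-Schmidt without the final division by the norm (stays rational)."""
    result = []
    for p in polys:
        w = [Fraction(c) for c in p]
        for f in result:
            c = inner(p, f) / inner(f, f)
            w = [wi - c * fi for wi, fi in zip(w, f)]
        result.append(w)
    return result

# [c0, c1, c2] represents c0 + c1*x + c2*x^2, so this is the basis 1, x, x^2.
basis = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
f1, f2, f3 = orthogonalize(basis)
norms_squared = [inner(f, f) for f in (f1, f2, f3)]
```

The squared norms come out as $2$, $\tfrac23$, $\tfrac{8}{45}$, and the third polynomial is $x^2 - \tfrac13$, matching the numerators and norms computed in the example.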
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Now we can answer the question about the existence of orthonormal bases. 


6.34 Existence of orthonormal basis 


Every finite-dimensional inner product space has an orthonormal basis. 


Proof Suppose V is finite-dimensional. Choose a basis of V. Apply the 
Gram-Schmidt Procedure (6.31) to it, producing an orthonormal list with 
length dim V. By 6.28, this orthonormal list is an orthonormal basis of V. m 


Sometimes we need to know not only that an orthonormal basis exists, but 
also that every orthonormal list can be extended to an orthonormal basis. In 
the next corollary, the Gram-Schmidt Procedure shows that such an extension 
is always possible. 


6.35 Orthonormal list extends to orthonormal basis 


Suppose V is finite-dimensional. Then every orthonormal list of vectors 
in V can be extended to an orthonormal basis of V. 


Proof Suppose $e_1, \dots, e_m$ is an orthonormal list of vectors in $V$. Then
$e_1, \dots, e_m$ is linearly independent (by 6.26). Hence this list can be extended to
a basis $e_1, \dots, e_m, v_1, \dots, v_n$ of $V$ (see 2.33). Now apply the Gram-Schmidt
Procedure (6.31) to $e_1, \dots, e_m, v_1, \dots, v_n$, producing an orthonormal list

6.36    \[ e_1, \dots, e_m, f_1, \dots, f_n; \]

here the formula given by the Gram-Schmidt Procedure leaves the first $m$
vectors unchanged because they are already orthonormal. The list above is an
orthonormal basis of $V$ by 6.28. ∎


Recall that a matrix is called upper triangular if all entries below the 
diagonal equal 0. In other words, an upper-triangular matrix looks like this: 


\[ \begin{pmatrix} * & & * \\ & \ddots & \\ 0 & & * \end{pmatrix}, \]
where the 0 in the matrix above indicates that all entries below the diagonal
equal 0, and asterisks are used to denote entries on and above the diagonal.
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In the last chapter we showed that if V is a finite-dimensional complex 
vector space, then for each operator on V there is a basis with respect to 
which the matrix of the operator is upper triangular (see 5.27). Now that we 
are dealing with inner product spaces, we would like to know whether there 
exists an orthonormal basis with respect to which we have an upper-triangular 
matrix. 

The next result shows that the existence of a basis with respect to which 
T has an upper-triangular matrix implies the existence of an orthonormal 
basis with this property. This result is true on both real and complex vector 
spaces (although on a real vector space, the hypothesis holds only for some 
operators). 


6.37 Upper-triangular matrix with respect to orthonormal basis 


Suppose T € L(V). If T has an upper-triangular matrix with respect to 
some basis of V, then T has an upper-triangular matrix with respect to 
some orthonormal basis of V. 


Proof Suppose $T$ has an upper-triangular matrix with respect to some basis
$v_1, \dots, v_n$ of $V$. Thus $\operatorname{span}(v_1, \dots, v_j)$ is invariant under $T$ for each
$j = 1, \dots, n$ (see 5.26).

Apply the Gram-Schmidt Procedure to $v_1, \dots, v_n$, producing an orthonormal
basis $e_1, \dots, e_n$ of $V$. Because
\[ \operatorname{span}(e_1, \dots, e_j) = \operatorname{span}(v_1, \dots, v_j) \]
for each $j$ (see 6.31), we conclude that $\operatorname{span}(e_1, \dots, e_j)$ is invariant under $T$
for each $j = 1, \dots, n$. Thus, by 5.26, $T$ has an upper-triangular matrix with
respect to the orthonormal basis $e_1, \dots, e_n$. ∎


The next result is an important application of the result above.

German mathematician Issai Schur (1875-1941) published the first
proof of the next result in 1909.


6.38 Schur’s Theorem 


Suppose V is a finite-dimensional complex vector space and T € L(V). 
Then T has an upper-triangular matrix with respect to some orthonormal 
basis of V. 


Proof Recall that $T$ has an upper-triangular matrix with respect to some basis
of $V$ (see 5.27). Now apply 6.37. ∎
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Linear Functionals on Inner Product Spaces 


Because linear maps into the scalar field F play a special role, we defined a 
special name for them in Section 3.F. That definition is repeated below in 
case you skipped Section 3.F. 


6.39 Definition linear functional 


A linear functional on V is a linear map from V to F. In other words, a 
linear functional is an element of L(V, F). 


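6.40 Example The function $\varphi\colon \mathbf{F}^3 \to \mathbf{F}$ defined by
\[ \varphi(z_1, z_2, z_3) = 2z_1 - 5z_2 + z_3 \]
is a linear functional on $\mathbf{F}^3$. We could write this linear functional in the form
\[ \varphi(z) = \langle z, u \rangle \]
for every $z \in \mathbf{F}^3$, where $u = (2, -5, 1)$.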


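6.41 Example The function $\varphi\colon \mathcal{P}_2(\mathbf{R}) \to \mathbf{R}$ defined by
\[ \varphi(p) = \int_{-1}^{1} p(t) \bigl( \cos(\pi t) \bigr)\,dt \]
is a linear functional on $\mathcal{P}_2(\mathbf{R})$ (here the inner product on $\mathcal{P}_2(\mathbf{R})$ is
multiplication followed by integration on $[-1, 1]$; see 6.33). It is not obvious that
there exists $u \in \mathcal{P}_2(\mathbf{R})$ such that
\[ \varphi(p) = \langle p, u \rangle \]
for every $p \in \mathcal{P}_2(\mathbf{R})$ [we cannot take $u(t) = \cos(\pi t)$ because that function
is not an element of $\mathcal{P}_2(\mathbf{R})$].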


If $u \in V$, then the map that sends $v$ to $\langle v, u \rangle$ is a linear functional on $V$.
The next result shows that every linear functional on $V$ is of this form.
Example 6.41 above illustrates the power of the next result because for the linear
functional in that example, there is no obvious candidate for $u$.

The next result is named in honor of Hungarian mathematician Frigyes
Riesz (1880-1956), who proved several results early in the twentieth
century that look very much like the result below.
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6.42 Riesz Representation Theorem 


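Suppose $V$ is finite-dimensional and $\varphi$ is a linear functional on $V$. Then
there is a unique vector $u \in V$ such that
\[ \varphi(v) = \langle v, u \rangle \]
for every $v \in V$.

Proof First we show there exists a vector $u \in V$ such that $\varphi(v) = \langle v, u \rangle$ for
every $v \in V$. Let $e_1, \dots, e_n$ be an orthonormal basis of $V$. Then
\[ \begin{aligned} \varphi(v) &= \varphi\bigl( \langle v, e_1 \rangle e_1 + \cdots + \langle v, e_n \rangle e_n \bigr) \\ &= \langle v, e_1 \rangle \varphi(e_1) + \cdots + \langle v, e_n \rangle \varphi(e_n) \\ &= \bigl\langle v, \overline{\varphi(e_1)}\, e_1 + \cdots + \overline{\varphi(e_n)}\, e_n \bigr\rangle \end{aligned} \]
for every $v \in V$, where the first equality comes from 6.30. Thus setting

6.43    \[ u = \overline{\varphi(e_1)}\, e_1 + \cdots + \overline{\varphi(e_n)}\, e_n, \]

we have $\varphi(v) = \langle v, u \rangle$ for every $v \in V$, as desired.

Now we prove that only one vector $u \in V$ has the desired behavior.
Suppose $u_1, u_2 \in V$ are such that
\[ \varphi(v) = \langle v, u_1 \rangle = \langle v, u_2 \rangle \]
for every $v \in V$. Then
\[ 0 = \langle v, u_1 \rangle - \langle v, u_2 \rangle = \langle v, u_1 - u_2 \rangle \]
for every $v \in V$. Taking $v = u_1 - u_2$ shows that $u_1 - u_2 = 0$. In other words,
$u_1 = u_2$, completing the proof of the uniqueness part of the result. ∎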
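The construction in the proof can be tried numerically. The sketch below is an illustration, not part of the text: it uses $\mathbf{C}^3$ with the standard inner product, the standard basis as the orthonormal basis, and a sample functional `phi` whose complex coefficients are chosen arbitrarily so that the complex conjugates in formula 6.43 actually matter.

```python
def inner(v, u):
    """Inner product on C^3: linear in the first slot, conjugate-linear in the second."""
    return sum(a * complex(b).conjugate() for a, b in zip(v, u))

def phi(z):
    """A sample linear functional on C^3 (coefficients are illustrative)."""
    return (2 + 1j) * z[0] - 5 * z[1] + 1j * z[2]

# The standard basis of C^3 is orthonormal.
basis = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]

# Formula 6.43: u = conj(phi(e_1)) e_1 + ... + conj(phi(e_n)) e_n.
u = [0j, 0j, 0j]
for e in basis:
    c = complex(phi(e)).conjugate()
    u = [ui + c * ei for ui, ei in zip(u, e)]

v = (1 - 2j, 3j, 4)   # arbitrary test vector
lhs = phi(v)          # value of the functional
rhs = inner(v, u)     # inner product with the representing vector
```

Here `u` is $(2 - i, -5, -i)$, the entrywise conjugate of the coefficients of `phi`; omitting the conjugation would make `lhs` and `rhs` disagree for this test vector.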


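6.44 Example Find $u \in \mathcal{P}_2(\mathbf{R})$ such that
\[ \int_{-1}^{1} p(t) \bigl( \cos(\pi t) \bigr)\,dt = \int_{-1}^{1} p(t) u(t)\,dt \]
for every $p \in \mathcal{P}_2(\mathbf{R})$.

Solution Let $\varphi(p) = \int_{-1}^{1} p(t) \bigl( \cos(\pi t) \bigr)\,dt$. Applying formula 6.43 from
the proof above, and using the orthonormal basis from Example 6.33, we have
\[ u(x) = \left( \int_{-1}^{1} \sqrt{\tfrac12} \bigl( \cos(\pi t) \bigr)\,dt \right) \sqrt{\tfrac12} + \left( \int_{-1}^{1} \sqrt{\tfrac32}\,t \bigl( \cos(\pi t) \bigr)\,dt \right) \sqrt{\tfrac32}\,x + \left( \int_{-1}^{1} \sqrt{\tfrac{45}{8}} \left( t^2 - \tfrac13 \right) \bigl( \cos(\pi t) \bigr)\,dt \right) \sqrt{\tfrac{45}{8}} \left( x^2 - \tfrac13 \right). \]
A bit of calculus shows that
\[ u(x) = -\frac{45}{2\pi^2} \left( x^2 - \tfrac13 \right). \]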
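The formula for $u$ in Example 6.44 can be sanity-checked numerically. The sketch below is an assumption-laden illustration: it approximates the integrals with a composite Simpson rule (the helper `simpson` and the step count are choices of this sketch) and compares $\varphi(p)$ with $\langle p, u \rangle$ on the basis $1, t, t^2$ of $\mathcal{P}_2(\mathbf{R})$, which suffices by linearity.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def u(x):
    """Candidate from Example 6.44: u(x) = -(45 / (2 pi^2)) (x^2 - 1/3)."""
    return -45 / (2 * math.pi ** 2) * (x * x - 1 / 3)

def phi(p):
    """phi(p) = integral over [-1, 1] of p(t) cos(pi t) dt."""
    return simpson(lambda t: p(t) * math.cos(math.pi * t), -1, 1)

def inner_with_u(p):
    """<p, u> = integral over [-1, 1] of p(t) u(t) dt."""
    return simpson(lambda t: p(t) * u(t), -1, 1)

monomials = [lambda t: 1.0, lambda t: t, lambda t: t * t]
gaps = [abs(phi(p) - inner_with_u(p)) for p in monomials]
```

All three gaps are at the level of quadrature error, and both sides equal $-4/\pi^2$ for $p(t) = t^2$.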


Suppose $V$ is finite-dimensional and $\varphi$ is a linear functional on $V$. Then 6.43
gives a formula for the vector $u$ that satisfies $\varphi(v) = \langle v, u \rangle$ for all $v \in V$.
Specifically, we have
\[ u = \overline{\varphi(e_1)}\, e_1 + \cdots + \overline{\varphi(e_n)}\, e_n. \]
The right side of the equation above seems to depend on the orthonormal
basis $e_1, \dots, e_n$ as well as on $\varphi$. However, 6.42 tells us that $u$ is uniquely
determined by $\varphi$. Thus the right side of the equation above is the same
regardless of which orthonormal basis $e_1, \dots, e_n$ of $V$ is chosen.


EXERCISES 6.B 


1 (a) Suppose $\theta \in \mathbf{R}$. Show that $(\cos\theta, \sin\theta), (-\sin\theta, \cos\theta)$ and
$(\cos\theta, \sin\theta), (\sin\theta, -\cos\theta)$ are orthonormal bases of $\mathbf{R}^2$.

(b) Show that each orthonormal basis of $\mathbf{R}^2$ is of the form given by
one of the two possibilities of part (a).


2 Suppose $e_1, \dots, e_m$ is an orthonormal list of vectors in $V$. Let $v \in V$.
Prove that
\[ \|v\|^2 = |\langle v, e_1 \rangle|^2 + \cdots + |\langle v, e_m \rangle|^2 \]
if and only if $v \in \operatorname{span}(e_1, \dots, e_m)$.

3 Suppose $T \in \mathcal{L}(\mathbf{R}^3)$ has an upper-triangular matrix with respect to
the basis $(1, 0, 0), (1, 1, 1), (1, 1, 2)$. Find an orthonormal basis of $\mathbf{R}^3$
(use the usual inner product on $\mathbf{R}^3$) with respect to which $T$ has an
upper-triangular matrix.

4 Suppose $n$ is a positive integer. Prove that
\[ \frac{1}{\sqrt{2\pi}},\ \frac{\cos x}{\sqrt{\pi}},\ \frac{\cos 2x}{\sqrt{\pi}},\ \dots,\ \frac{\cos nx}{\sqrt{\pi}},\ \frac{\sin x}{\sqrt{\pi}},\ \frac{\sin 2x}{\sqrt{\pi}},\ \dots,\ \frac{\sin nx}{\sqrt{\pi}} \]
is an orthonormal list of vectors in $C[-\pi, \pi]$, the vector space of continuous
real-valued functions on $[-\pi, \pi]$ with inner product
\[ \langle f, g \rangle = \int_{-\pi}^{\pi} f(x) g(x)\,dx. \]
[The orthonormal list above is often used for modeling periodic phenomena
such as tides.]


5 On $\mathcal{P}_2(\mathbf{R})$, consider the inner product given by
\[ \langle p, q \rangle = \int_0^1 p(x) q(x)\,dx. \]
Apply the Gram-Schmidt Procedure to the basis $1, x, x^2$ to produce an
orthonormal basis of $\mathcal{P}_2(\mathbf{R})$.

6 Find an orthonormal basis of $\mathcal{P}_2(\mathbf{R})$ (with inner product as in Exercise 5)
such that the differentiation operator (the operator that takes $p$ to $p'$)
on $\mathcal{P}_2(\mathbf{R})$ has an upper-triangular matrix with respect to this basis.

7 Find a polynomial $q \in \mathcal{P}_2(\mathbf{R})$ such that
\[ p\left(\tfrac12\right) = \int_0^1 p(x) q(x)\,dx \]
for every $p \in \mathcal{P}_2(\mathbf{R})$.

8 Find a polynomial $q \in \mathcal{P}_2(\mathbf{R})$ such that
\[ \int_0^1 p(x) (\cos \pi x)\,dx = \int_0^1 p(x) q(x)\,dx \]
for every $p \in \mathcal{P}_2(\mathbf{R})$.

9 What happens if the Gram-Schmidt Procedure is applied to a list of
vectors that is not linearly independent?


10 Suppose $V$ is a real inner product space and $v_1, \dots, v_m$ is a linearly
independent list of vectors in $V$. Prove that there exist exactly $2^m$ orthonormal
lists $e_1, \dots, e_m$ of vectors in $V$ such that
\[ \operatorname{span}(v_1, \dots, v_j) = \operatorname{span}(e_1, \dots, e_j) \]
for all $j \in \{1, \dots, m\}$.

11 Suppose $\langle \cdot, \cdot \rangle_1$ and $\langle \cdot, \cdot \rangle_2$ are inner products on $V$ such that $\langle v, w \rangle_1 = 0$
if and only if $\langle v, w \rangle_2 = 0$. Prove that there is a positive number $c$ such
that $\langle v, w \rangle_1 = c \langle v, w \rangle_2$ for every $v, w \in V$.

12 Suppose $V$ is finite-dimensional and $\langle \cdot, \cdot \rangle_1$, $\langle \cdot, \cdot \rangle_2$ are inner products on
$V$ with corresponding norms $\| \cdot \|_1$ and $\| \cdot \|_2$. Prove that there exists a
positive number $c$ such that
\[ \|v\|_1 \le c \|v\|_2 \]
for every $v \in V$.

13 Suppose $v_1, \dots, v_m$ is a linearly independent list in $V$. Show that there
exists $w \in V$ such that $\langle w, v_j \rangle > 0$ for all $j \in \{1, \dots, m\}$.

14 Suppose $e_1, \dots, e_n$ is an orthonormal basis of $V$ and $v_1, \dots, v_n$ are
vectors in $V$ such that
\[ \|e_j - v_j\| < \frac{1}{\sqrt{n}} \]
for each $j$. Prove that $v_1, \dots, v_n$ is a basis of $V$.

15 Suppose $C_{\mathbf{R}}([-1, 1])$ is the vector space of continuous real-valued
functions on the interval $[-1, 1]$ with inner product given by
\[ \langle f, g \rangle = \int_{-1}^{1} f(x) g(x)\,dx \]
for $f, g \in C_{\mathbf{R}}([-1, 1])$. Let $\varphi$ be the linear functional on $C_{\mathbf{R}}([-1, 1])$
defined by $\varphi(f) = f(0)$. Show that there does not exist $g \in C_{\mathbf{R}}([-1, 1])$
such that
\[ \varphi(f) = \langle f, g \rangle \]
for every $f \in C_{\mathbf{R}}([-1, 1])$.
[The exercise above shows that the Riesz Representation Theorem (6.42)
does not hold on infinite-dimensional vector spaces without additional
hypotheses on $V$ and $\varphi$.]

16 Suppose $\mathbf{F} = \mathbf{C}$, $V$ is finite-dimensional, $T \in \mathcal{L}(V)$, all the eigenvalues
of $T$ have absolute value less than 1, and $\epsilon > 0$. Prove that there exists a
positive integer $m$ such that $\|T^m v\| \le \epsilon \|v\|$ for every $v \in V$.


17 For $u \in V$, let $\Phi u$ denote the linear functional on $V$ defined by
\[ (\Phi u)(v) = \langle v, u \rangle \]
for $v \in V$.

(a) Show that if $\mathbf{F} = \mathbf{R}$, then $\Phi$ is a linear map from $V$ to $V'$. (Recall
from Section 3.F that $V' = \mathcal{L}(V, \mathbf{F})$ and that $V'$ is called the dual
space of $V$.)

(b) Show that if $\mathbf{F} = \mathbf{C}$ and $V \ne \{0\}$, then $\Phi$ is not a linear map.

(c) Show that $\Phi$ is injective.

(d) Suppose $\mathbf{F} = \mathbf{R}$ and $V$ is finite-dimensional. Use parts (a) and (c)
and a dimension-counting argument (but without using 6.42) to
show that $\Phi$ is an isomorphism from $V$ onto $V'$.

[Part (d) gives an alternative proof of the Riesz Representation Theorem
(6.42) when $\mathbf{F} = \mathbf{R}$. Part (d) also gives a natural isomorphism (meaning
that it does not depend on a choice of basis) from a finite-dimensional
real inner product space onto its dual space.]
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6.C Orthogonal Complements and 
Minimization Problems 


Orthogonal Complements 


6.45 Definition orthogonal complement, $U^\perp$

If $U$ is a subset of $V$, then the orthogonal complement of $U$, denoted $U^\perp$,
is the set of all vectors in $V$ that are orthogonal to every vector in $U$:
\[ U^\perp = \{ v \in V : \langle v, u \rangle = 0 \text{ for every } u \in U \}. \]

For example, if $U$ is a line in $\mathbf{R}^3$, then $U^\perp$ is the plane containing the
origin that is perpendicular to $U$. If $U$ is a plane in $\mathbf{R}^3$, then $U^\perp$ is the line
containing the origin that is perpendicular to $U$.


6.46 Basic properties of orthogonal complement 


(a) If $U$ is a subset of $V$, then $U^\perp$ is a subspace of $V$.

(b) $\{0\}^\perp = V$.

(c) $V^\perp = \{0\}$.

(d) If $U$ is a subset of $V$, then $U \cap U^\perp \subset \{0\}$.

(e) If $U$ and $W$ are subsets of $V$ and $U \subset W$, then $W^\perp \subset U^\perp$.


Proof 


(a) Suppose $U$ is a subset of $V$. Then $\langle 0, u \rangle = 0$ for every $u \in U$; thus
$0 \in U^\perp$.

Suppose $v, w \in U^\perp$. If $u \in U$, then
\[ \langle v + w, u \rangle = \langle v, u \rangle + \langle w, u \rangle = 0 + 0 = 0. \]
Thus $v + w \in U^\perp$. In other words, $U^\perp$ is closed under addition.

Similarly, suppose $\lambda \in \mathbf{F}$ and $v \in U^\perp$. If $u \in U$, then
\[ \langle \lambda v, u \rangle = \lambda \langle v, u \rangle = \lambda \cdot 0 = 0. \]
Thus $\lambda v \in U^\perp$. In other words, $U^\perp$ is closed under scalar
multiplication. Thus $U^\perp$ is a subspace of $V$.

(b) Suppose $v \in V$. Then $\langle v, 0 \rangle = 0$, which implies that $v \in \{0\}^\perp$. Thus
$\{0\}^\perp = V$.

(c) Suppose $v \in V^\perp$. Then $\langle v, v \rangle = 0$, which implies that $v = 0$. Thus
$V^\perp = \{0\}$.

(d) Suppose $U$ is a subset of $V$ and $v \in U \cap U^\perp$. Then $\langle v, v \rangle = 0$, which
implies that $v = 0$. Thus $U \cap U^\perp \subset \{0\}$.

(e) Suppose $U$ and $W$ are subsets of $V$ and $U \subset W$. Suppose $v \in W^\perp$.
Then $\langle v, u \rangle = 0$ for every $u \in W$, which implies that $\langle v, u \rangle = 0$ for
every $u \in U$. Hence $v \in U^\perp$. Thus $W^\perp \subset U^\perp$. ∎


Recall that if $U, W$ are subspaces of $V$, then $V$ is the direct sum of $U$ and
$W$ (written $V = U \oplus W$) if each element of $V$ can be written in exactly one
way as a vector in $U$ plus a vector in $W$ (see 1.40).

The next result shows that every finite-dimensional subspace of $V$ leads to
a natural direct sum decomposition of $V$.


6.47 Direct sum of a subspace and its orthogonal complement


Suppose $U$ is a finite-dimensional subspace of $V$. Then
\[ V = U \oplus U^\perp. \]

Proof First we will show that

6.48    \[ V = U + U^\perp. \]

To do this, suppose $v \in V$. Let $e_1, \dots, e_m$ be an orthonormal basis of $U$.
Obviously

6.49    \[ v = \underbrace{\langle v, e_1 \rangle e_1 + \cdots + \langle v, e_m \rangle e_m}_{u} + \underbrace{v - \langle v, e_1 \rangle e_1 - \cdots - \langle v, e_m \rangle e_m}_{w}. \]

Let $u$ and $w$ be defined as in the equation above. Clearly $u \in U$. Because
$e_1, \dots, e_m$ is an orthonormal list, for each $j = 1, \dots, m$ we have
\[ \langle w, e_j \rangle = \langle v, e_j \rangle - \langle v, e_j \rangle = 0. \]
Thus $w$ is orthogonal to every vector in $\operatorname{span}(e_1, \dots, e_m)$. In other words,
$w \in U^\perp$. Thus we have written $v = u + w$, where $u \in U$ and $w \in U^\perp$,
completing the proof of 6.48.

From 6.46(d), we know that $U \cap U^\perp = \{0\}$. Along with 6.48, this implies
that $V = U \oplus U^\perp$ (see 1.45). ∎
Now we can see how to compute $\dim U^\perp$ from $\dim U$.


6.50 Dimension of the orthogonal complement


Suppose $V$ is finite-dimensional and $U$ is a subspace of $V$. Then
\[ \dim U^\perp = \dim V - \dim U. \]

Proof The formula for $\dim U^\perp$ follows immediately from 6.47 and 3.78. ∎


The next result is an important consequence of 6.47. 


6.51 The orthogonal complement of the orthogonal complement 


Suppose $U$ is a finite-dimensional subspace of $V$. Then
\[ U = (U^\perp)^\perp. \]

Proof First we will show that

6.52    \[ U \subset (U^\perp)^\perp. \]

To do this, suppose $u \in U$. Then $\langle u, v \rangle = 0$ for every $v \in U^\perp$ (by the
definition of $U^\perp$). Because $u$ is orthogonal to every vector in $U^\perp$, we have
$u \in (U^\perp)^\perp$, completing the proof of 6.52.

To prove the inclusion in the other direction, suppose $v \in (U^\perp)^\perp$. By
6.47, we can write $v = u + w$, where $u \in U$ and $w \in U^\perp$. We have
$v - u = w \in U^\perp$. Because $v \in (U^\perp)^\perp$ and $u \in (U^\perp)^\perp$ (from 6.52), we
have $v - u \in (U^\perp)^\perp$. Thus $v - u \in U^\perp \cap (U^\perp)^\perp$, which implies that $v - u$
is orthogonal to itself, which implies that $v - u = 0$, which implies that
$v = u$, which implies that $v \in U$. Thus $(U^\perp)^\perp \subset U$, which along with 6.52
completes the proof. ∎


We now define an operator $P_U$ for each finite-dimensional subspace of $V$.


6.53 Definition orthogonal projection, $P_U$


Suppose $U$ is a finite-dimensional subspace of $V$. The orthogonal
projection of $V$ onto $U$ is the operator $P_U \in \mathcal{L}(V)$ defined as follows:
For $v \in V$, write $v = u + w$, where $u \in U$ and $w \in U^\perp$. Then $P_U v = u$.

The direct sum decomposition $V = U \oplus U^\perp$ given by 6.47 shows that
each $v \in V$ can be uniquely written in the form $v = u + w$ with $u \in U$ and
$w \in U^\perp$. Thus $P_U v$ is well defined.


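6.54 Example Suppose $x \in V$ with $x \ne 0$ and $U = \operatorname{span}(x)$. Show that
\[ P_U v = \frac{\langle v, x \rangle}{\|x\|^2}\, x \]
for every $v \in V$.

Solution Suppose $v \in V$. Then
\[ v = \frac{\langle v, x \rangle}{\|x\|^2}\, x + \left( v - \frac{\langle v, x \rangle}{\|x\|^2}\, x \right), \]
where the first term on the right is in $\operatorname{span}(x)$ (and thus in $U$) and the second
term on the right is orthogonal to $x$ (and thus is in $U^\perp$). Thus $P_U v$ equals the
first term on the right, as desired.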


6.55 Properties of the orthogonal projection Py 


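Suppose $U$ is a finite-dimensional subspace of $V$ and $v \in V$. Then

(a) $P_U \in \mathcal{L}(V)$;

(b) $P_U u = u$ for every $u \in U$;

(c) $P_U w = 0$ for every $w \in U^\perp$;

(d) $\operatorname{range} P_U = U$;

(e) $\operatorname{null} P_U = U^\perp$;

(f) $v - P_U v \in U^\perp$;

(g) $P_U^2 = P_U$;

(h) $\|P_U v\| \le \|v\|$;

(i) for every orthonormal basis $e_1, \dots, e_m$ of $U$,
\[ P_U v = \langle v, e_1 \rangle e_1 + \cdots + \langle v, e_m \rangle e_m. \]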
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Proof 


(a) To show that $P_U$ is a linear map on $V$, suppose $v_1, v_2 \in V$. Write
\[ v_1 = u_1 + w_1 \quad \text{and} \quad v_2 = u_2 + w_2 \]
with $u_1, u_2 \in U$ and $w_1, w_2 \in U^\perp$. Thus $P_U v_1 = u_1$ and $P_U v_2 = u_2$.
Now
\[ v_1 + v_2 = (u_1 + u_2) + (w_1 + w_2), \]
where $u_1 + u_2 \in U$ and $w_1 + w_2 \in U^\perp$. Thus
\[ P_U(v_1 + v_2) = u_1 + u_2 = P_U v_1 + P_U v_2. \]
Similarly, suppose $\lambda \in \mathbf{F}$. The equation $v = u + w$ with $u \in U$ and
$w \in U^\perp$ implies that $\lambda v = \lambda u + \lambda w$ with $\lambda u \in U$ and $\lambda w \in U^\perp$.
Thus $P_U(\lambda v) = \lambda u = \lambda P_U v$.

Hence $P_U$ is a linear map from $V$ to $V$.

(b) Suppose $u \in U$. We can write $u = u + 0$, where $u \in U$ and $0 \in U^\perp$.
Thus $P_U u = u$.

(c) Suppose $w \in U^\perp$. We can write $w = 0 + w$, where $0 \in U$ and $w \in U^\perp$.
Thus $P_U w = 0$.

(d) The definition of $P_U$ implies that $\operatorname{range} P_U \subset U$. Part (b) implies that
$U \subset \operatorname{range} P_U$. Thus $\operatorname{range} P_U = U$.

(e) Part (c) implies that $U^\perp \subset \operatorname{null} P_U$. To prove the inclusion in the other
direction, note that if $v \in \operatorname{null} P_U$ then the decomposition given by 6.47
must be $v = 0 + v$, where $0 \in U$ and $v \in U^\perp$. Thus $\operatorname{null} P_U \subset U^\perp$.

(f) If $v = u + w$ with $u \in U$ and $w \in U^\perp$, then
\[ v - P_U v = v - u = w \in U^\perp. \]

(g) If $v = u + w$ with $u \in U$ and $w \in U^\perp$, then
\[ (P_U^2) v = P_U(P_U v) = P_U u = u = P_U v. \]

(h) If $v = u + w$ with $u \in U$ and $w \in U^\perp$, then
\[ \|P_U v\|^2 = \|u\|^2 \le \|u\|^2 + \|w\|^2 = \|v\|^2, \]
where the last equality comes from the Pythagorean Theorem.

(i) The formula for $P_U v$ follows from equation 6.49 in the proof of 6.47. ∎
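Several parts of 6.55 can be observed on a small numerical example. The following sketch is illustrative only: the subspace $U \subset \mathbf{R}^3$, its orthonormal basis `onb`, the test vector `v`, and the helper names are assumptions of this sketch. The projection is computed with the formula in part (i).

```python
import math

def inner(u, w):
    """Usual inner product on R^3 (real case)."""
    return sum(a * b for a, b in zip(u, w))

def project(v, onb):
    """P_U v = <v, e_1> e_1 + ... + <v, e_m> e_m for an orthonormal basis of U."""
    p = [0.0] * len(v)
    for e in onb:
        c = inner(v, e)
        p = [pi + c * ei for pi, ei in zip(p, e)]
    return p

# U is the span of two orthonormal vectors in R^3 (an illustrative choice).
s = 1 / math.sqrt(2)
onb = [(1.0, 0.0, 0.0), (0.0, s, s)]

v = (3.0, 1.0, 5.0)
pv = project(v, onb)                     # P_U v
w = [vi - pi for vi, pi in zip(v, pv)]   # v - P_U v, which lies in U-perp by (f)
ppv = project(pv, onb)                   # P_U (P_U v), which equals P_U v by (g)
```

For this input, $P_U v = (3, 3, 3)$ and $v - P_U v = (0, -2, 2)$, so $\|P_U v\|^2 = 27 \le 35 = \|v\|^2$, in line with part (h).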
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Minimization Problems 


The following problem often arises: given a subspace $U$ of $V$ and a point
$v \in V$, find a point $u \in U$ such that $\|v - u\|$ is as small as possible. The
next proposition shows that this minimization problem is solved by taking
$u = P_U v$.


The remarkable simplicity of the so- 
lution to this minimization problem 
has led to many important applica- 
tions of inner product spaces out- 
side of pure mathematics. 


6.56 Minimizing the distance to a subspace 


Suppose $U$ is a finite-dimensional subspace of $V$, $v \in V$, and $u \in U$. Then
\[ \|v - P_U v\| \le \|v - u\|. \]
Furthermore, the inequality above is an equality if and only if $u = P_U v$.

Proof We have

6.57    \[ \begin{aligned} \|v - P_U v\|^2 &\le \|v - P_U v\|^2 + \|P_U v - u\|^2 \\ &= \|(v - P_U v) + (P_U v - u)\|^2 \\ &= \|v - u\|^2, \end{aligned} \]

where the first line above holds because $0 \le \|P_U v - u\|^2$, the second
line above comes from the Pythagorean Theorem [which applies because
$v - P_U v \in U^\perp$ by 6.55(f), and $P_U v - u \in U$], and the third line above holds
by simple algebra. Taking square roots gives the desired inequality.

Our inequality above is an equality if and only if 6.57 is an equality,
which happens if and only if $\|P_U v - u\| = 0$, which happens if and only if
$u = P_U v$. ∎

Figure: $P_U v$ is the closest point in $U$ to $v$.
The last result is often combined with the formula 6.55(i) to compute
explicit solutions to minimization problems.


6.58 Example Find a polynomial $u$ with real coefficients and degree at
most 5 that approximates $\sin x$ as well as possible on the interval $[-\pi, \pi]$, in
the sense that
\[ \int_{-\pi}^{\pi} |\sin x - u(x)|^2\,dx \]
is as small as possible. Compare this result to the Taylor series approximation.


Solution Let $C_{\mathbf{R}}[-\pi, \pi]$ denote the real inner product space of continuous
real-valued functions on $[-\pi, \pi]$ with inner product

6.59    \[ \langle f, g \rangle = \int_{-\pi}^{\pi} f(x) g(x)\,dx. \]

Let $v \in C_{\mathbf{R}}[-\pi, \pi]$ be the function defined by $v(x) = \sin x$. Let $U$ denote the
subspace of $C_{\mathbf{R}}[-\pi, \pi]$ consisting of the polynomials with real coefficients
and degree at most 5. Our problem can now be reformulated as follows:

Find $u \in U$ such that $\|v - u\|$ is as small as possible.

A computer that can perform integrations is useful here.

To compute the solution to our approximation problem, first apply the
Gram-Schmidt Procedure (using the inner product given by 6.59) to the basis
$1, x, x^2, x^3, x^4, x^5$ of $U$, producing an orthonormal basis
$e_1, e_2, e_3, e_4, e_5, e_6$ of $U$. Then, again using the inner
product given by 6.59, compute $P_U v$ using 6.55(i) (with $m = 6$). Doing this
computation shows that $P_U v$ is the function $u$ defined by

6.60    \[ u(x) = 0.987862x - 0.155271x^3 + 0.00564312x^5, \]

where the $\pi$'s that appear in the exact answer have been replaced with a good
decimal approximation.

By 6.56, the polynomial $u$ above is the best approximation to $\sin x$ on
$[-\pi, \pi]$ using polynomials of degree at most 5 (here "best approximation"
means in the sense of minimizing $\int_{-\pi}^{\pi} |\sin x - u(x)|^2\,dx$). To see how good
this approximation is, the next figure shows the graphs of both $\sin x$ and our
approximation $u(x)$ given by 6.60 over the interval $[-\pi, \pi]$.

Figure: Graphs on $[-\pi, \pi]$ of $\sin x$ (blue) and
its approximation $u(x)$ (red) given by 6.60.


Our approximation 6.60 is so accurate that the two graphs are almost 
identical—our eyes may see only one graph! Here the blue graph is placed 
almost exactly over the red graph. If you are viewing this on an electronic 
device, try enlarging the picture above, especially near 3 or —3, to see a small 
gap between the two graphs. 

Another well-known approximation to $\sin x$ by a polynomial of degree 5
is given by the Taylor polynomial

6.61    \[ x - \frac{x^3}{3!} + \frac{x^5}{5!}. \]

To see how good this approximation is, the next picture shows the graphs of
both $\sin x$ and the Taylor polynomial 6.61 over the interval $[-\pi, \pi]$.

Figure: Graphs on $[-\pi, \pi]$ of $\sin x$ (blue) and the Taylor polynomial 6.61 (red).


The Taylor polynomial is an excellent approximation to sin x for x near 0. 
But the picture above shows that for |x| > 2, the Taylor polynomial is not 
so accurate, especially compared to 6.60. For example, taking x = 3, our 
approximation 6.60 estimates sin 3 with an error of about 0.001, but the Taylor 
series 6.61 estimates sin 3 with an error of about 0.4. Thus at x = 3, the error 
in the Taylor series is hundreds of times larger than the error given by 6.60. 
Linear algebra has helped us discover an approximation to sin x that improves 
upon what we learned in calculus! 
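The computation behind 6.60 can be reproduced in floating point. The sketch below is one possible arrangement, not the author's own code: polynomials are stored as coefficient lists, the monomial integrals over $[-\pi, \pi]$ are evaluated in closed form, and the integrals of $x^k \sin x$ use the integration-by-parts recursion $I_k = \pi^k - k(k-1) I_{k-2}$ for $I_k = \int_0^{\pi} x^k \sin x\,dx$ with $I_1 = \pi$. All function names are assumptions of this sketch.

```python
import math

PI = math.pi

def moment(k):
    """Integral of x^k over [-pi, pi]."""
    return 0.0 if k % 2 else 2 * PI ** (k + 1) / (k + 1)

def sin_moment(k):
    """Integral of x^k sin x over [-pi, pi], via integration by parts."""
    if k % 2 == 0:
        return 0.0              # even power times sin x is an odd function
    half = PI                   # integral of x sin x over [0, pi]
    for m in range(3, k + 1, 2):
        half = PI ** m - m * (m - 1) * half
    return 2 * half

def inner(p, q):
    """<p, q> from 6.59 for polynomials given as coefficient lists."""
    return sum(a * b * moment(i + j)
               for i, a in enumerate(p) for j, b in enumerate(q))

def inner_sin(p):
    """<sin, p> from 6.59 for a polynomial p given as a coefficient list."""
    return sum(a * sin_moment(i) for i, a in enumerate(p))

# Gram-Schmidt (6.31) applied to the basis 1, x, ..., x^5 of U.
n = 6
onb = []
for j in range(n):
    v = [1.0 if i == j else 0.0 for i in range(n)]
    w = list(v)
    for e in onb:
        c = inner(v, e)
        w = [wi - c * ei for wi, ei in zip(w, e)]
    length = math.sqrt(inner(w, w))
    onb.append([wi / length for wi in w])

# 6.55(i): P_U v = <v, e_1> e_1 + ... + <v, e_6> e_6 with v = sin.
u = [0.0] * n
for e in onb:
    c = inner_sin(e)
    u = [ui + c * ei for ui, ei in zip(u, e)]

def evaluate(p, x):
    return sum(a * x ** i for i, a in enumerate(p))

taylor = [0, 1, 0, -1 / 6, 0, 1 / 120]
err_projection = abs(math.sin(3) - evaluate(u, 3))
err_taylor = abs(math.sin(3) - evaluate(taylor, 3))
```

The list `u` holds the coefficients of $1, x, \dots, x^5$; its odd-index entries agree with 6.60 to the digits shown there, and the two errors at $x = 3$ are roughly $0.001$ and $0.4$, as stated above.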
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EXERCISES 6.C 


1 Suppose $v_1, \dots, v_m \in V$. Prove that
\[ \{v_1, \dots, v_m\}^\perp = \bigl( \operatorname{span}(v_1, \dots, v_m) \bigr)^\perp. \]

2 Suppose $U$ is a finite-dimensional subspace of $V$. Prove that $U^\perp = \{0\}$
if and only if $U = V$.
[Exercise 14(a) shows that the result above is not true without the
hypothesis that $U$ is finite-dimensional.]

3 Suppose $U$ is a subspace of $V$ with basis $u_1, \dots, u_m$ and
\[ u_1, \dots, u_m, w_1, \dots, w_n \]
is a basis of $V$. Prove that if the Gram-Schmidt Procedure is applied
to the basis of $V$ above, producing a list $e_1, \dots, e_m, f_1, \dots, f_n$, then
$e_1, \dots, e_m$ is an orthonormal basis of $U$ and $f_1, \dots, f_n$ is an
orthonormal basis of $U^\perp$.

4 Suppose $U$ is the subspace of $\mathbf{R}^4$ defined by
\[ U = \operatorname{span}\bigl( (1, 2, 3, -4), (-5, 4, 3, 2) \bigr). \]
Find an orthonormal basis of $U$ and an orthonormal basis of $U^\perp$.

5 Suppose $V$ is finite-dimensional and $U$ is a subspace of $V$. Show that
$P_{U^\perp} = I - P_U$, where $I$ is the identity operator on $V$.

6 Suppose $U$ and $W$ are finite-dimensional subspaces of $V$. Prove that
$P_U P_W = 0$ if and only if $\langle u, w \rangle = 0$ for all $u \in U$ and all $w \in W$.

7 Suppose $V$ is finite-dimensional and $P \in \mathcal{L}(V)$ is such that $P^2 = P$ and
every vector in $\operatorname{null} P$ is orthogonal to every vector in $\operatorname{range} P$. Prove
that there exists a subspace $U$ of $V$ such that $P = P_U$.

8 Suppose $V$ is finite-dimensional and $P \in \mathcal{L}(V)$ is such that $P^2 = P$
and
\[ \|Pv\| \le \|v\| \]
for every $v \in V$. Prove that there exists a subspace $U$ of $V$ such that
$P = P_U$.


9 Suppose $T \in \mathcal{L}(V)$ and $U$ is a finite-dimensional subspace of $V$. Prove
that $U$ is invariant under $T$ if and only if $P_U T P_U = T P_U$.

10 Suppose $V$ is finite-dimensional, $T \in \mathcal{L}(V)$, and $U$ is a subspace
of $V$. Prove that $U$ and $U^\perp$ are both invariant under $T$ if and only
if $P_U T = T P_U$.

11 In $\mathbf{R}^4$, let
\[ U = \operatorname{span}\bigl( (1, 1, 0, 0), (1, 1, 1, 2) \bigr). \]
Find $u \in U$ such that $\|u - (1, 2, 3, 4)\|$ is as small as possible.


12 Find $p \in \mathcal{P}_3(\mathbf{R})$ such that $p(0) = 0$, $p'(0) = 0$, and
\[ \int_0^1 |2 + 3x - p(x)|^2\,dx \]
is as small as possible.


13 Find $p \in \mathcal{P}_5(\mathbf{R})$ that makes
\[ \int_{-\pi}^{\pi} |\sin x - p(x)|^2\,dx \]
as small as possible.
[The polynomial 6.60 is an excellent approximation to the answer to this
exercise, but here you are asked to find the exact solution, which involves
powers of $\pi$. A computer that can perform symbolic integration will be
useful.]

14 Suppose $C_{\mathbf{R}}([-1, 1])$ is the vector space of continuous real-valued
functions on the interval $[-1, 1]$ with inner product given by
\[ \langle f, g \rangle = \int_{-1}^{1} f(x) g(x)\,dx \]
for $f, g \in C_{\mathbf{R}}([-1, 1])$. Let $U$ be the subspace of $C_{\mathbf{R}}([-1, 1])$ defined
by
\[ U = \{ f \in C_{\mathbf{R}}([-1, 1]) : f(0) = 0 \}. \]

(a) Show that $U^\perp = \{0\}$.

(b) Show that 6.47 and 6.51 do not hold without the finite-dimensional
hypothesis.
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Isaac Newton 
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Operators on Inner Product 
Spaces 


The deepest results related to inner product spaces deal with the subject 
to which we now turn—operators on inner product spaces. By exploiting 
properties of the adjoint, we will develop a detailed description of several 
important classes of operators on inner product spaces. 

A new assumption for this chapter is listed in the second bullet point below: 


7.1 Notation F, V


• F denotes R or C.


• V and W denote finite-dimensional inner product spaces over F.


LEARNING OBJECTIVES FOR THIS CHAPTER


• adjoint
• Spectral Theorem
• positive operators
• isometries
• Polar Decomposition
• Singular Value Decomposition
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7.A  Self-Adjoint and Normal Operators 
Adjoints 


7.2 Definition adjoint, T* 


Suppose $T \in \mathcal{L}(V, W)$. The adjoint of $T$ is the function $T^*\colon W \to V$
such that
\[ \langle Tv, w \rangle = \langle v, T^* w \rangle \]
for every $v \in V$ and every $w \in W$.


The word adjoint has another meaning in linear algebra. We do
not need the second meaning in this book. In case you encounter
the second meaning for adjoint elsewhere, be warned that the two
meanings for adjoint are unrelated to each other.

To see why the definition above makes sense, suppose $T \in \mathcal{L}(V, W)$.
Fix $w \in W$. Consider the linear functional on $V$ that maps $v \in V$ to $\langle Tv, w \rangle$;
this linear functional depends on $T$ and $w$. By the Riesz Representation
Theorem (6.42), there exists a unique vector in $V$ such that this linear
functional is given by taking the inner product with it. We call this unique
vector $T^* w$. In other words, $T^* w$ is the unique vector in $V$ such that
$\langle Tv, w \rangle = \langle v, T^* w \rangle$ for every $v \in V$.


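7.3 Example Define $T\colon \mathbf{R}^3 \to \mathbf{R}^2$ by
\[ T(x_1, x_2, x_3) = (x_2 + 3x_3, 2x_1). \]
Find a formula for $T^*$.

Solution Here $T^*$ will be a function from $\mathbf{R}^2$ to $\mathbf{R}^3$. To compute $T^*$, fix a
point $(y_1, y_2) \in \mathbf{R}^2$. Then for every $(x_1, x_2, x_3) \in \mathbf{R}^3$ we have
\[ \begin{aligned} \langle (x_1, x_2, x_3), T^*(y_1, y_2) \rangle &= \langle T(x_1, x_2, x_3), (y_1, y_2) \rangle \\ &= \langle (x_2 + 3x_3, 2x_1), (y_1, y_2) \rangle \\ &= x_2 y_1 + 3 x_3 y_1 + 2 x_1 y_2 \\ &= \langle (x_1, x_2, x_3), (2y_2, y_1, 3y_1) \rangle. \end{aligned} \]
Thus
\[ T^*(y_1, y_2) = (2y_2, y_1, 3y_1). \]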
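The defining identity $\langle Tv, w \rangle = \langle v, T^* w \rangle$ from 7.2 can be spot-checked for the formula just derived. This short sketch is only an illustration; the sample vectors are arbitrary, and a finite set of checks is of course not a proof.

```python
def T(x):
    """T(x1, x2, x3) = (x2 + 3 x3, 2 x1) from Example 7.3."""
    x1, x2, x3 = x
    return (x2 + 3 * x3, 2 * x1)

def T_star(y):
    """T*(y1, y2) = (2 y2, y1, 3 y1), the formula derived in Example 7.3."""
    y1, y2 = y
    return (2 * y2, y1, 3 * y1)

def inner(u, w):
    """Usual inner product on R^n."""
    return sum(a * b for a, b in zip(u, w))

# Arbitrary sample pairs (x, y) with x in R^3 and y in R^2.
samples = [((1, 2, 3), (4, 5)), ((-2, 0, 7), (1, -1)), ((0, 1, 0), (3, 2))]
checks = [inner(T(x), y) == inner(x, T_star(y)) for x, y in samples]
```

With respect to the standard bases, the matrix of `T_star` is the transpose of the matrix of `T`, which is why the two inner products agree.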
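7.4 Example Fix $u \in V$ and $x \in W$. Define $T \in \mathcal{L}(V, W)$ by
\[ Tv = \langle v, u \rangle x \]
for every $v \in V$. Find a formula for $T^*$.

Solution Fix $w \in W$. Then for every $v \in V$ we have
\[ \begin{aligned} \langle v, T^* w \rangle &= \langle Tv, w \rangle \\ &= \bigl\langle \langle v, u \rangle x, w \bigr\rangle \\ &= \langle v, u \rangle \langle x, w \rangle \\ &= \bigl\langle v, \langle w, x \rangle u \bigr\rangle. \end{aligned} \]
Thus
\[ T^* w = \langle w, x \rangle u. \]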

In the two examples above, T* turned out to be not just a function but a 
linear map. This is true in general, as shown by the next result. 

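The proofs of the next two results use a common technique: flip $T^*$ from
one side of an inner product to become $T$ on the other side.


7.5 The adjoint is a linear map
If $T \in \mathcal{L}(V, W)$, then $T^* \in \mathcal{L}(W, V)$.


Proof Suppose $T \in \mathcal{L}(V, W)$. Fix $w_1, w_2 \in W$. If $v \in V$, then
\[ \begin{aligned} \langle v, T^*(w_1 + w_2) \rangle &= \langle Tv, w_1 + w_2 \rangle \\ &= \langle Tv, w_1 \rangle + \langle Tv, w_2 \rangle \\ &= \langle v, T^* w_1 \rangle + \langle v, T^* w_2 \rangle \\ &= \langle v, T^* w_1 + T^* w_2 \rangle, \end{aligned} \]
which shows that $T^*(w_1 + w_2) = T^* w_1 + T^* w_2$.

Fix $w \in W$ and $\lambda \in \mathbf{F}$. If $v \in V$, then
\[ \begin{aligned} \langle v, T^*(\lambda w) \rangle &= \langle Tv, \lambda w \rangle \\ &= \bar{\lambda} \langle Tv, w \rangle \\ &= \bar{\lambda} \langle v, T^* w \rangle \\ &= \langle v, \lambda T^* w \rangle, \end{aligned} \]
which shows that $T^*(\lambda w) = \lambda T^* w$.
Thus $T^*$ is a linear map, as desired. ∎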
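7.6 Properties of the adjoint


(a) $(S + T)^* = S^* + T^*$ for all $S, T \in \mathcal{L}(V, W)$;
(b) $(\lambda T)^* = \bar{\lambda} T^*$ for all $\lambda \in \mathbf{F}$ and $T \in \mathcal{L}(V, W)$;
(c) $(T^*)^* = T$ for all $T \in \mathcal{L}(V, W)$;
(d) $I^* = I$, where $I$ is the identity operator on $V$;
(e) $(ST)^* = T^*S^*$ for all $T \in \mathcal{L}(V, W)$ and $S \in \mathcal{L}(W, U)$ (here $U$
is an inner product space over $\mathbf{F}$).

Proof
(a) Suppose $S, T \in \mathcal{L}(V, W)$. If $v \in V$ and $w \in W$, then
\begin{align*}
\langle v, (S + T)^*w \rangle &= \langle (S + T)v, w \rangle \\
&= \langle Sv, w \rangle + \langle Tv, w \rangle \\
&= \langle v, S^*w \rangle + \langle v, T^*w \rangle \\
&= \langle v, S^*w + T^*w \rangle.
\end{align*}
Thus $(S + T)^*w = S^*w + T^*w$, as desired.

(b) Suppose $\lambda \in \mathbf{F}$ and $T \in \mathcal{L}(V, W)$. If $v \in V$ and $w \in W$, then
\[ \langle v, (\lambda T)^*w \rangle = \langle \lambda Tv, w \rangle = \lambda \langle Tv, w \rangle = \lambda \langle v, T^*w \rangle = \langle v, \bar{\lambda} T^*w \rangle. \]
Thus $(\lambda T)^*w = \bar{\lambda} T^*w$, as desired.

(c) Suppose $T \in \mathcal{L}(V, W)$. If $v \in V$ and $w \in W$, then
\[ \langle w, (T^*)^*v \rangle = \langle T^*w, v \rangle = \overline{\langle v, T^*w \rangle} = \overline{\langle Tv, w \rangle} = \langle w, Tv \rangle. \]
Thus $(T^*)^*v = Tv$, as desired.

(d) If $v, u \in V$, then
\[ \langle v, I^*u \rangle = \langle Iv, u \rangle = \langle v, u \rangle. \]
Thus $I^*u = u$, as desired.

(e) Suppose $T \in \mathcal{L}(V, W)$ and $S \in \mathcal{L}(W, U)$. If $v \in V$ and $u \in U$, then
\[ \langle v, (ST)^*u \rangle = \langle STv, u \rangle = \langle Tv, S^*u \rangle = \langle v, T^*(S^*u) \rangle. \]
Thus $(ST)^*u = T^*(S^*u)$, as desired. ■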
The next result shows the relationship between the null space and the range
of a linear map and its adjoint. The symbol $\iff$ used in the proof means “if
and only if”; this symbol could also be read to mean “is equivalent to”.


7.7 Null space and range of T*
Suppose $T \in \mathcal{L}(V, W)$. Then

(a) $\operatorname{null} T^* = (\operatorname{range} T)^\perp$;
(b) $\operatorname{range} T^* = (\operatorname{null} T)^\perp$;
(c) $\operatorname{null} T = (\operatorname{range} T^*)^\perp$;
(d) $\operatorname{range} T = (\operatorname{null} T^*)^\perp$.


Proof We begin by proving (a). Let $w \in W$. Then
\begin{align*}
w \in \operatorname{null} T^* &\iff T^*w = 0 \\
&\iff \langle v, T^*w \rangle = 0 \text{ for all } v \in V \\
&\iff \langle Tv, w \rangle = 0 \text{ for all } v \in V \\
&\iff w \in (\operatorname{range} T)^\perp.
\end{align*}
Thus $\operatorname{null} T^* = (\operatorname{range} T)^\perp$, proving (a).
If we take the orthogonal complement of both sides of (a), we get (d),
where we have used 6.51. Replacing $T$ with $T^*$ in (a) gives (c), where we
have used 7.6(c). Finally, replacing $T$ with $T^*$ in (d) gives (b). ■


7.8 Definition conjugate transpose


The conjugate transpose of an m-by-n matrix is the n-by-m matrix obtained
by interchanging the rows and columns and then taking the complex
conjugate of each entry.


7.9 Example

The conjugate transpose of the matrix
\[ \begin{pmatrix} 2 & 3 + 4i & 7 \\ 6 & 5 & 8i \end{pmatrix} \]
is the matrix
\[ \begin{pmatrix} 2 & 6 \\ 3 - 4i & 5 \\ 7 & -8i \end{pmatrix}. \]

[If $\mathbf{F} = \mathbf{R}$, then the conjugate transpose of a matrix is the same as its
transpose, which is the matrix obtained by interchanging the rows and columns.]
The next result shows how to compute the matrix of $T^*$ from the matrix
of $T$.

Caution: Remember that the result below applies only when we are dealing
with orthonormal bases. With respect to nonorthonormal bases, the matrix of $T^*$
does not necessarily equal the conjugate transpose of the matrix of $T$.

[The adjoint of a linear map does not depend on a choice of basis. This
explains why this book emphasizes adjoints of linear maps instead of
conjugate transposes of matrices.]


7.10 The matrix of T*


Let $T \in \mathcal{L}(V, W)$. Suppose $e_1, \dots, e_n$ is an orthonormal basis of $V$ and
$f_1, \dots, f_m$ is an orthonormal basis of $W$. Then
\[ \mathcal{M}\bigl(T^*, (f_1, \dots, f_m), (e_1, \dots, e_n)\bigr) \]
is the conjugate transpose of
\[ \mathcal{M}\bigl(T, (e_1, \dots, e_n), (f_1, \dots, f_m)\bigr). \]


Proof In this proof, we will write $\mathcal{M}(T)$ instead of the longer expression
$\mathcal{M}\bigl(T, (e_1, \dots, e_n), (f_1, \dots, f_m)\bigr)$; we will also write $\mathcal{M}(T^*)$ instead
of $\mathcal{M}\bigl(T^*, (f_1, \dots, f_m), (e_1, \dots, e_n)\bigr)$.

Recall that we obtain the $k$th column of $\mathcal{M}(T)$ by writing $Te_k$ as a linear
combination of the $f_j$'s; the scalars used in this linear combination then
become the $k$th column of $\mathcal{M}(T)$. Because $f_1, \dots, f_m$ is an orthonormal
basis of $W$, we know how to write $Te_k$ as a linear combination of the $f_j$'s
(see 6.30):
\[ Te_k = \langle Te_k, f_1 \rangle f_1 + \dots + \langle Te_k, f_m \rangle f_m. \]
Thus the entry in row $j$, column $k$, of $\mathcal{M}(T)$ is $\langle Te_k, f_j \rangle$.

Replacing $T$ with $T^*$ and interchanging the roles played by the $e$'s and
$f$'s, we see that the entry in row $j$, column $k$, of $\mathcal{M}(T^*)$ is $\langle T^*f_k, e_j \rangle$,
which equals $\langle f_k, Te_j \rangle$, which equals $\overline{\langle Te_j, f_k \rangle}$, which equals the complex
conjugate of the entry in row $k$, column $j$, of $\mathcal{M}(T)$. In other words, $\mathcal{M}(T^*)$
is the conjugate transpose of $\mathcal{M}(T)$. ■
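With respect to the standard bases (which are orthonormal for the usual inner products), 7.10 says that the adjoint of multiplication by a matrix is multiplication by its conjugate transpose. The sketch below is an illustration only, under that assumption; the helper names and the test vectors `x` and `y` are arbitrary choices, not from the text. It uses the matrix from Example 7.9 and small Gaussian integers so that floating-point arithmetic is exact.

```python
def conjugate_transpose(A):
    """Swap rows and columns, then conjugate each entry (Definition 7.8)."""
    rows, cols = len(A), len(A[0])
    return [[complex(A[r][c]).conjugate() for r in range(rows)]
            for c in range(cols)]


def apply(A, x):
    """Matrix-vector product."""
    return [sum(a * t for a, t in zip(row, x)) for row in A]


def inner(x, y):
    """Standard inner product on C^n: linear in the first slot,
    conjugate-linear in the second."""
    return sum(s * complex(t).conjugate() for s, t in zip(x, y))


A = [[2, 3 + 4j, 7],
     [6, 5, 8j]]            # matrix from Example 7.9, viewed as a map C^3 -> C^2
A_star = conjugate_transpose(A)

x = [1 + 1j, 2, -1j]        # arbitrary vector in C^3
y = [3, 1 - 2j]             # arbitrary vector in C^2
lhs = inner(apply(A, x), y)         # <Ax, y>
rhs = inner(x, apply(A_star, y))    # <x, A*y>
```

Here `lhs` and `rhs` agree, as 7.10 predicts for orthonormal bases.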
Self-Adjoint Operators


Now we switch our attention to operators on inner product spaces. Thus
instead of considering linear maps from $V$ to $W$, we will be focusing on linear
maps from $V$ to $V$; recall that such linear maps are called operators.


7.11 Definition self-adjoint


An operator $T \in \mathcal{L}(V)$ is called self-adjoint if $T = T^*$. In other words,
$T \in \mathcal{L}(V)$ is self-adjoint if and only if
\[ \langle Tv, w \rangle = \langle v, Tw \rangle \]
for all $v, w \in V$.


7.12 Example Suppose $T$ is the operator on $\mathbf{F}^2$ whose matrix (with respect
to the standard basis) is
\[ \begin{pmatrix} 2 & b \\ 3 & 7 \end{pmatrix}. \]
Find all numbers $b$ such that $T$ is self-adjoint.


Solution The operator $T$ is self-adjoint if and only if $b = 3$ (because
$\mathcal{M}(T) = \mathcal{M}(T^*)$ if and only if $b = 3$; recall that $\mathcal{M}(T^*)$ is the conjugate
transpose of $\mathcal{M}(T)$; see 7.10).


You should verify that the sum of two self-adjoint operators is self-adjoint 
and that the product of a real scalar and a self-adjoint operator is self-adjoint. 

A good analogy to keep in mind (especially when $\mathbf{F} = \mathbf{C}$) is that the adjoint
on $\mathcal{L}(V)$ plays a role similar to complex conjugation on $\mathbf{C}$. A complex number
$z$ is real if and only if $z = \bar{z}$; thus a self-adjoint operator ($T = T^*$) is analogous
to a real number.

[Some mathematicians use the term Hermitian instead of self-adjoint,
honoring French mathematician Charles Hermite, who in 1873 published the
first proof that $e$ is not a zero of any polynomial with integer coefficients.]

We will see that the analogy discussed above is reflected in some important 
properties of self-adjoint operators, beginning with eigenvalues in the next 
result. 

If F = R, then by definition every eigenvalue is real, so the next result is 
interesting only when F = C. 
7.13 Eigenvalues of self-adjoint operators are real


Every eigenvalue of a self-adjoint operator is real.


Proof Suppose $T$ is a self-adjoint operator on $V$. Let $\lambda$ be an eigenvalue of
$T$, and let $v$ be a nonzero vector in $V$ such that $Tv = \lambda v$. Then
\[ \lambda \|v\|^2 = \langle \lambda v, v \rangle = \langle Tv, v \rangle = \langle v, Tv \rangle = \langle v, \lambda v \rangle = \bar{\lambda} \|v\|^2. \]
Thus $\lambda = \bar{\lambda}$, which means that $\lambda$ is real, as desired. ■
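As a concrete illustration of 7.13, the eigenvalues of a 2-by-2 matrix can be computed from its trace and determinant with the quadratic formula. The sketch below is not from the text: the function name `eigenvalues_2x2` and the matrix `H` are illustrative choices. It contrasts a matrix equal to its conjugate transpose with a real matrix that is not.

```python
import cmath


def eigenvalues_2x2(M):
    """Roots of the polynomial z^2 - (trace)z + (determinant) for a 2-by-2 matrix."""
    (a, b), (c, d) = M
    tr = a + d
    det = a * d - b * c
    root = cmath.sqrt(tr * tr - 4 * det)
    return ((tr + root) / 2, (tr - root) / 2)


H = [[2, 1 - 1j],
     [1 + 1j, 3]]       # equals its conjugate transpose (illustrative choice)
N = [[2, -3],
     [3, 2]]            # does not equal its conjugate transpose

h_eigs = eigenvalues_2x2(H)   # both values have zero imaginary part
n_eigs = eigenvalues_2x2(N)   # a pair of nonreal complex numbers
```

For `H` the discriminant is $(a - d)^2 + 4|b|^2 \ge 0$, which is why both roots are real.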


The next result is false for real inner product spaces. As an example,
consider the operator $T \in \mathcal{L}(\mathbf{R}^2)$ that is a counterclockwise rotation of 90°
around the origin; thus $T(x, y) = (-y, x)$. Obviously $Tv$ is orthogonal to $v$
for every $v \in \mathbf{R}^2$, even though $T \ne 0$.
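The rotation example can be checked directly, and it is instructive to feed the same formula a complex vector. The sketch below is an illustration with hypothetical helper names `rotate` and `inner`. For real vectors $\langle Tv, v \rangle$ is 0, but once complex vectors are allowed the same formula produces a nonzero value, so this operator does not contradict the result for complex inner product spaces.

```python
def rotate(v):
    """Counterclockwise rotation by 90 degrees: T(x, y) = (-y, x)."""
    x, y = v
    return (-y, x)


def inner(u, w):
    """Standard inner product; the second slot is conjugated,
    so the same function works on R^2 and C^2."""
    return sum(s * complex(t).conjugate() for s, t in zip(u, w))


real_value = inner(rotate((3, 4)), (3, 4))          # real vector
complex_value = inner(rotate((1, 1j)), (1, 1j))     # complex vector
```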


7.14 Over C, Tv is orthogonal to v for all v only for the 0 operator


Suppose $V$ is a complex inner product space and $T \in \mathcal{L}(V)$. Suppose
\[ \langle Tv, v \rangle = 0 \]
for all $v \in V$. Then $T = 0$.


Proof We have
\begin{align*}
\langle Tu, w \rangle ={}& \frac{\langle T(u + w), u + w \rangle - \langle T(u - w), u - w \rangle}{4} \\
&+ \frac{\langle T(u + iw), u + iw \rangle - \langle T(u - iw), u - iw \rangle}{4}\, i
\end{align*}
for all $u, w \in V$, as can be verified by computing the right side. Note that
each term on the right side is of the form $\langle Tv, v \rangle$ for appropriate $v \in V$. Thus
our hypothesis implies that $\langle Tu, w \rangle = 0$ for all $u, w \in V$. This implies that
$T = 0$ (take $w = Tu$). ■
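The identity at the start of the proof of 7.14 can be verified by hand, and it can also be spot-checked numerically. The following sketch is illustrative only; the matrix `A`, the vectors `u` and `w`, and the helper names are arbitrary choices. It evaluates the right side for an operator on $\mathbf{C}^2$ given by a matrix and compares it with $\langle Tu, w \rangle$. Small Gaussian integers keep the floating-point arithmetic exact.

```python
def apply(A, x):
    """Matrix-vector product."""
    return [sum(a * t for a, t in zip(row, x)) for row in A]


def inner(x, y):
    """Standard inner product on C^n, conjugate-linear in the second slot."""
    return sum(s * complex(t).conjugate() for s, t in zip(x, y))


def polarization_rhs(A, u, w):
    """Right side of the identity in the proof of 7.14,
    built only from values of the form <Tv, v>."""
    def q(v):
        return inner(apply(A, v), v)

    plus = [a + b for a, b in zip(u, w)]
    minus = [a - b for a, b in zip(u, w)]
    plus_i = [a + 1j * b for a, b in zip(u, w)]
    minus_i = [a - 1j * b for a, b in zip(u, w)]
    return (q(plus) - q(minus)) / 4 + (q(plus_i) - q(minus_i)) / 4 * 1j


A = [[1, 2j],
     [3, 4]]        # arbitrary operator on C^2
u = [1, 1j]
w = [2, -1]
```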
The next result is false for real inner product spaces, as shown by considering
any operator on a real inner product space that is not self-adjoint.

[The next result provides another example of how self-adjoint operators
behave like real numbers.]
7.15 Over C, ⟨Tv, v⟩ is real for all v only for self-adjoint operators


Suppose $V$ is a complex inner product space and $T \in \mathcal{L}(V)$. Then $T$ is
self-adjoint if and only if
\[ \langle Tv, v \rangle \in \mathbf{R} \]
for every $v \in V$.


Proof Let $v \in V$. Then
\[ \langle Tv, v \rangle - \overline{\langle Tv, v \rangle} = \langle Tv, v \rangle - \langle v, Tv \rangle = \langle Tv, v \rangle - \langle T^*v, v \rangle = \langle (T - T^*)v, v \rangle. \]
If $\langle Tv, v \rangle \in \mathbf{R}$ for every $v \in V$, then the left side of the equation above equals
0, so $\langle (T - T^*)v, v \rangle = 0$ for every $v \in V$. This implies that $T - T^* = 0$ (by
7.14). Hence $T$ is self-adjoint.

Conversely, if $T$ is self-adjoint, then the right side of the equation above
equals 0, so $\langle Tv, v \rangle = \overline{\langle Tv, v \rangle}$ for every $v \in V$. This implies that $\langle Tv, v \rangle \in \mathbf{R}$
for every $v \in V$, as desired. ■


On a real inner product space $V$, a nonzero operator $T$ might satisfy
$\langle Tv, v \rangle = 0$ for all $v \in V$. However, the next result shows that this cannot
happen for a self-adjoint operator.


7.16 If T = T* and ⟨Tv, v⟩ = 0 for all v, then T = 0


Suppose $T$ is a self-adjoint operator on $V$ such that
\[ \langle Tv, v \rangle = 0 \]
for all $v \in V$. Then $T = 0$.


Proof We have already proved this (without the hypothesis that $T$ is self-adjoint)
when $V$ is a complex inner product space (see 7.14). Thus we can
assume that $V$ is a real inner product space. If $u, w \in V$, then

7.17
\[ \langle Tu, w \rangle = \frac{\langle T(u + w), u + w \rangle - \langle T(u - w), u - w \rangle}{4}; \]

this is proved by computing the right side using the equation
\[ \langle Tw, u \rangle = \langle w, Tu \rangle = \langle Tu, w \rangle, \]
where the first equality holds because $T$ is self-adjoint and the second equality
holds because we are working in a real inner product space.

Each term on the right side of 7.17 is of the form $\langle Tv, v \rangle$ for appropriate $v$.
Thus $\langle Tu, w \rangle = 0$ for all $u, w \in V$. This implies that $T = 0$ (take $w = Tu$). ■
Normal Operators


7.18 Definition normal


• An operator on an inner product space is called normal if it commutes
with its adjoint.

• In other words, $T \in \mathcal{L}(V)$ is normal if
\[ TT^* = T^*T. \]


Obviously every self-adjoint operator is normal, because if $T$ is self-adjoint
then $T^* = T$.


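7.19 Example Let $T$ be the operator on $\mathbf{F}^2$ whose matrix (with respect to
the standard basis) is
\[ \begin{pmatrix} 2 & -3 \\ 3 & 2 \end{pmatrix}. \]
Show that $T$ is not self-adjoint and that $T$ is normal.


Solution This operator is not self-adjoint because the entry in row 2, column 1
(which equals 3) does not equal the complex conjugate of the entry in row 1,
column 2 (which equals $-3$).

The matrix of $TT^*$ equals
\[ \begin{pmatrix} 2 & -3 \\ 3 & 2 \end{pmatrix} \begin{pmatrix} 2 & 3 \\ -3 & 2 \end{pmatrix}, \text{ which equals } \begin{pmatrix} 13 & 0 \\ 0 & 13 \end{pmatrix}. \]
Similarly, the matrix of $T^*T$ equals
\[ \begin{pmatrix} 2 & 3 \\ -3 & 2 \end{pmatrix} \begin{pmatrix} 2 & -3 \\ 3 & 2 \end{pmatrix}, \text{ which equals } \begin{pmatrix} 13 & 0 \\ 0 & 13 \end{pmatrix}. \]
Because $TT^*$ and $T^*T$ have the same matrix, we see that $TT^* = T^*T$.
Thus $T$ is normal.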
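The two matrix products in Example 7.19 are quick to confirm by machine. The sketch below is an illustration with hypothetical helper names; it works with respect to the standard basis, which is orthonormal, so the matrix of $T^*$ is the conjugate transpose of the matrix of $T$.

```python
def conjugate_transpose(A):
    """Swap rows and columns, then conjugate each entry."""
    return [[complex(A[r][c]).conjugate() for r in range(len(A))]
            for c in range(len(A[0]))]


def matmul(A, B):
    """Product of matrices given as lists of rows."""
    return [[sum(A[r][k] * B[k][c] for k in range(len(B)))
             for c in range(len(B[0]))] for r in range(len(A))]


T = [[2, -3],
     [3, 2]]                      # matrix from Example 7.19
T_star = conjugate_transpose(T)

TT_star = matmul(T, T_star)       # matrix of T T*
T_starT = matmul(T_star, T)       # matrix of T* T
is_self_adjoint = (T == T_star)   # False for this matrix
is_normal = (TT_star == T_starT)  # True for this matrix
```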


In the next section we will see why normal operators are worthy of special
attention.

The next result provides a simple characterization of normal operators.

[The next result implies that $\operatorname{null} T = \operatorname{null} T^*$ for every normal
operator $T$.]
7.20 T is normal if and only if ‖Tv‖ = ‖T*v‖ for all v


An operator $T \in \mathcal{L}(V)$ is normal if and only if
\[ \|Tv\| = \|T^*v\| \]
for all $v \in V$.


Proof Let $T \in \mathcal{L}(V)$. We will prove both directions of this result at the same
time. Note that
\begin{align*}
T \text{ is normal} &\iff T^*T - TT^* = 0 \\
&\iff \langle (T^*T - TT^*)v, v \rangle = 0 \text{ for all } v \in V \\
&\iff \langle T^*Tv, v \rangle = \langle TT^*v, v \rangle \text{ for all } v \in V \\
&\iff \|Tv\|^2 = \|T^*v\|^2 \text{ for all } v \in V,
\end{align*}
where we used 7.16 to establish the second equivalence (note that the operator
$T^*T - TT^*$ is self-adjoint). The equivalence of the first and last conditions
above gives the desired result. ■
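The characterization in 7.20 is easy to see in a small computation. The sketch below is illustrative; the matrix `N`, the vector `v`, and the helper names are arbitrary choices, not from the text. It compares $\|Tv\|^2$ and $\|T^*v\|^2$ for the normal operator of Example 7.19 and for a matrix that does not commute with its conjugate transpose.

```python
def conjugate_transpose(A):
    """Swap rows and columns, then conjugate each entry."""
    return [[complex(A[r][c]).conjugate() for r in range(len(A))]
            for c in range(len(A[0]))]


def apply(A, x):
    """Matrix-vector product."""
    return [sum(a * t for a, t in zip(row, x)) for row in A]


def norm_sq(x):
    """Squared norm with respect to the standard inner product."""
    return sum((t * complex(t).conjugate()).real for t in x)


T = [[2, -3], [3, 2]]     # normal (Example 7.19)
N = [[1, 1], [0, 1]]      # not normal (illustrative choice)
v = [1, 2]

t_pair = (norm_sq(apply(T, v)), norm_sq(apply(conjugate_transpose(T), v)))
n_pair = (norm_sq(apply(N, v)), norm_sq(apply(conjugate_transpose(N), v)))
```

A single vector with unequal norms, as for `N`, is enough to show an operator is not normal; equality for one vector proves nothing by itself.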


Compare the next corollary to Exercise 2. That exercise states that the 
eigenvalues of the adjoint of each operator are equal (as a set) to the complex 
conjugates of the eigenvalues of the operator. The exercise says nothing 
about eigenvectors, because an operator and its adjoint may have different 
eigenvectors. However, the next corollary implies that a normal operator and 
its adjoint have the same eigenvectors. 


7.21 For T normal, T and T* have the same eigenvectors


Suppose $T \in \mathcal{L}(V)$ is normal and $v \in V$ is an eigenvector of $T$ with
eigenvalue $\lambda$. Then $v$ is also an eigenvector of $T^*$ with eigenvalue $\bar{\lambda}$.


Proof Because $T$ is normal, so is $T - \lambda I$, as you should verify. Using 7.20,
we have
\[ 0 = \|(T - \lambda I)v\| = \|(T - \lambda I)^*v\| = \|(T^* - \bar{\lambda} I)v\|. \]
Hence $v$ is an eigenvector of $T^*$ with eigenvalue $\bar{\lambda}$, as desired. ■


Because every self-adjoint operator is normal, the next result applies in 
particular to self-adjoint operators. 
7.22 Orthogonal eigenvectors for normal operators


Suppose $T \in \mathcal{L}(V)$ is normal. Then eigenvectors of $T$ corresponding to
distinct eigenvalues are orthogonal.


Proof Suppose $\alpha, \beta$ are distinct eigenvalues of $T$, with corresponding
eigenvectors $u, v$. Thus $Tu = \alpha u$ and $Tv = \beta v$. From 7.21 we have $T^*v = \bar{\beta} v$.
Thus
\begin{align*}
(\alpha - \beta) \langle u, v \rangle &= \langle \alpha u, v \rangle - \langle u, \bar{\beta} v \rangle \\
&= \langle Tu, v \rangle - \langle u, T^*v \rangle \\
&= 0.
\end{align*}
Because $\alpha \ne \beta$, the equation above implies that $\langle u, v \rangle = 0$. Thus $u$ and $v$ are
orthogonal, as desired. ■


EXERCISES 7.A 


1 Suppose $n$ is a positive integer. Define $T \in \mathcal{L}(\mathbf{F}^n)$ by
\[ T(z_1, \dots, z_n) = (0, z_1, \dots, z_{n-1}). \]
Find a formula for $T^*(z_1, \dots, z_n)$.


2 Suppose $T \in \mathcal{L}(V)$ and $\lambda \in \mathbf{F}$. Prove that $\lambda$ is an eigenvalue of $T$ if and
only if $\bar{\lambda}$ is an eigenvalue of $T^*$.


3 Suppose $T \in \mathcal{L}(V)$ and $U$ is a subspace of $V$. Prove that $U$ is invariant
under $T$ if and only if $U^\perp$ is invariant under $T^*$.


4 Suppose $T \in \mathcal{L}(V, W)$. Prove that
(a) $T$ is injective if and only if $T^*$ is surjective;
(b) $T$ is surjective if and only if $T^*$ is injective.


5 Prove that
\[ \dim \operatorname{null} T^* = \dim \operatorname{null} T + \dim W - \dim V \]
and
\[ \dim \operatorname{range} T^* = \dim \operatorname{range} T \]
for every $T \in \mathcal{L}(V, W)$.
6 Make $\mathcal{P}_2(\mathbf{R})$ into an inner product space by defining
\[ \langle p, q \rangle = \int_0^1 p(x) q(x)\, dx. \]
Define $T \in \mathcal{L}(\mathcal{P}_2(\mathbf{R}))$ by $T(a_0 + a_1 x + a_2 x^2) = a_1 x$.

(a) Show that $T$ is not self-adjoint.
(b) The matrix of $T$ with respect to the basis $(1, x, x^2)$ is
\[ \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \]
This matrix equals its conjugate transpose, even though $T$ is not
self-adjoint. Explain why this is not a contradiction.


7 Suppose $S, T \in \mathcal{L}(V)$ are self-adjoint. Prove that $ST$ is self-adjoint if
and only if $ST = TS$.


8 Suppose $V$ is a real inner product space. Show that the set of self-adjoint
operators on $V$ is a subspace of $\mathcal{L}(V)$.


9 Suppose $V$ is a complex inner product space with $V \ne \{0\}$. Show that
the set of self-adjoint operators on $V$ is not a subspace of $\mathcal{L}(V)$.


10 Suppose $\dim V \ge 2$. Show that the set of normal operators on $V$ is not a
subspace of $\mathcal{L}(V)$.


11 Suppose $P \in \mathcal{L}(V)$ is such that $P^2 = P$. Prove that there is a subspace
$U$ of $V$ such that $P = P_U$ if and only if $P$ is self-adjoint.


12 Suppose that $T$ is a normal operator on $V$ and that 3 and 4 are eigenvalues
of $T$. Prove that there exists a vector $v \in V$ such that $\|v\| = \sqrt{2}$ and
$\|Tv\| = 5$.


13 Give an example of an operator $T \in \mathcal{L}(\mathbf{C}^4)$ such that $T$ is normal but
not self-adjoint.


14 Suppose $T$ is a normal operator on $V$. Suppose also that $v, w \in V$ satisfy
the equations
\[ \|v\| = \|w\| = 2, \quad Tv = 3v, \quad Tw = 4w. \]
Show that $\|T(v + w)\| = 10$.
15 Fix $u, x \in V$. Define $T \in \mathcal{L}(V)$ by
\[ Tv = \langle v, u \rangle x \]
for every $v \in V$.

(a) Suppose $\mathbf{F} = \mathbf{R}$. Prove that $T$ is self-adjoint if and only if $u, x$ is
linearly dependent.

(b) Prove that $T$ is normal if and only if $u, x$ is linearly dependent.


16 Suppose $T \in \mathcal{L}(V)$ is normal. Prove that
\[ \operatorname{range} T = \operatorname{range} T^*. \]


17 Suppose $T \in \mathcal{L}(V)$ is normal. Prove that
\[ \operatorname{null} T^k = \operatorname{null} T \quad \text{and} \quad \operatorname{range} T^k = \operatorname{range} T \]
for every positive integer $k$.


18 Prove or give a counterexample: If $T \in \mathcal{L}(V)$ and there exists an
orthonormal basis $e_1, \dots, e_n$ of $V$ such that $\|Te_j\| = \|T^*e_j\|$ for each $j$,
then $T$ is normal.


19 Suppose $T \in \mathcal{L}(\mathbf{C}^3)$ is normal and $T(1, 1, 1) = (2, 2, 2)$. Suppose
$(z_1, z_2, z_3) \in \operatorname{null} T$. Prove that $z_1 + z_2 + z_3 = 0$.


20 Suppose $T \in \mathcal{L}(V, W)$ and $\mathbf{F} = \mathbf{R}$. Let $\Phi_V$ be the isomorphism from $V$
onto the dual space $V'$ given by Exercise 17 in Section 6.B, and let $\Phi_W$
be the corresponding isomorphism from $W$ onto $W'$. Show that if $\Phi_V$ and
$\Phi_W$ are used to identify $V$ and $W$ with $V'$ and $W'$, then $T^*$ is identified
with the dual map $T'$. More precisely, show that $\Phi_V \circ T^* = T' \circ \Phi_W$.


21 Fix a positive integer $n$. In the inner product space of continuous
real-valued functions on $[-\pi, \pi]$ with inner product
\[ \langle f, g \rangle = \int_{-\pi}^{\pi} f(x) g(x)\, dx, \]
let
\[ V = \operatorname{span}(1, \cos x, \cos 2x, \dots, \cos nx, \sin x, \sin 2x, \dots, \sin nx). \]

(a) Define $D \in \mathcal{L}(V)$ by $Df = f'$. Show that $D^* = -D$. Conclude
that $D$ is normal but not self-adjoint.

(b) Define $T \in \mathcal{L}(V)$ by $Tf = f''$. Show that $T$ is self-adjoint.
7.B  The Spectral Theorem


Recall that a diagonal matrix is a square matrix that is 0 everywhere except
possibly along the diagonal. Recall also that an operator on $V$ has a diagonal
matrix with respect to a basis if and only if the basis consists of eigenvectors
of the operator (see 5.41).

The nicest operators on $V$ are those for which there is an orthonormal
basis of $V$ with respect to which the operator has a diagonal matrix. These
are precisely the operators $T \in \mathcal{L}(V)$ such that there is an orthonormal basis
of $V$ consisting of eigenvectors of $T$. Our goal in this section is to prove the
Spectral Theorem, which characterizes these operators as the normal operators
when $\mathbf{F} = \mathbf{C}$ and as the self-adjoint operators when $\mathbf{F} = \mathbf{R}$. The Spectral
Theorem is probably the most useful tool in the study of operators on inner
product spaces.

Because the conclusion of the Spectral Theorem depends on F, we will 
break the Spectral Theorem into two pieces, called the Complex Spectral 
Theorem and the Real Spectral Theorem. As is often the case in linear algebra, 
complex vector spaces are easier to deal with than real vector spaces. Thus 
we present the Complex Spectral Theorem first. 


The Complex Spectral Theorem 


The key part of the Complex Spectral Theorem (7.24) states that if $\mathbf{F} = \mathbf{C}$
and $T \in \mathcal{L}(V)$ is normal, then $T$ has a diagonal matrix with respect to some
orthonormal basis of $V$. The next example illustrates this conclusion.


7.23 Example Consider the normal operator $T \in \mathcal{L}(\mathbf{C}^2)$ from Example
7.19, whose matrix (with respect to the standard basis) is
\[ \begin{pmatrix} 2 & -3 \\ 3 & 2 \end{pmatrix}. \]
As you can verify, $\frac{(i, 1)}{\sqrt{2}}, \frac{(-i, 1)}{\sqrt{2}}$ is an orthonormal basis of $\mathbf{C}^2$ consisting of
eigenvectors of $T$, and with respect to this basis the matrix of $T$ is the diagonal
matrix
\[ \begin{pmatrix} 2 + 3i & 0 \\ 0 & 2 - 3i \end{pmatrix}. \]

In the next result, the equivalence of (b) and (c) is easy (see 5.41). Thus 
we prove only that (c) implies (a) and that (a) implies (c). 
7.24 Complex Spectral Theorem

Suppose $\mathbf{F} = \mathbf{C}$ and $T \in \mathcal{L}(V)$. Then the following are equivalent:

(a) $T$ is normal.

(b) $V$ has an orthonormal basis consisting of eigenvectors of $T$.

(c) $T$ has a diagonal matrix with respect to some orthonormal basis
of $V$.


Proof First suppose (c) holds, so $T$ has a diagonal matrix with respect to
some orthonormal basis of $V$. The matrix of $T^*$ (with respect to the same
basis) is obtained by taking the conjugate transpose of the matrix of $T$; hence
$T^*$ also has a diagonal matrix. Any two diagonal matrices commute; thus $T$
commutes with $T^*$, which means that $T$ is normal. In other words, (a) holds.

Now suppose (a) holds, so $T$ is normal. By Schur’s Theorem (6.38),
there is an orthonormal basis $e_1, \dots, e_n$ of $V$ with respect to which $T$ has an
upper-triangular matrix. Thus we can write

7.25
\[ \mathcal{M}\bigl(T, (e_1, \dots, e_n)\bigr) = \begin{pmatrix} a_{1,1} & \cdots & a_{1,n} \\ & \ddots & \vdots \\ 0 & & a_{n,n} \end{pmatrix}. \]

We will show that this matrix is actually a diagonal matrix.
We see from the matrix above that
\[ \|Te_1\|^2 = |a_{1,1}|^2 \]
and
\[ \|T^*e_1\|^2 = |a_{1,1}|^2 + |a_{1,2}|^2 + \dots + |a_{1,n}|^2. \]
Because $T$ is normal, $\|Te_1\| = \|T^*e_1\|$ (see 7.20). Thus the two equations
above imply that all entries in the first row of the matrix in 7.25, except
possibly the first entry $a_{1,1}$, equal 0.

Now from 7.25 we see that
\[ \|Te_2\|^2 = |a_{2,2}|^2 \]
(because $a_{1,2} = 0$, as we showed in the paragraph above) and
\[ \|T^*e_2\|^2 = |a_{2,2}|^2 + |a_{2,3}|^2 + \dots + |a_{2,n}|^2. \]
Because $T$ is normal, $\|Te_2\| = \|T^*e_2\|$. Thus the two equations above imply
that all entries in the second row of the matrix in 7.25, except possibly the
diagonal entry $a_{2,2}$, equal 0.

Continuing in this fashion, we see that all the nondiagonal entries in the
matrix 7.25 equal 0. Thus (c) holds. ■
The Real Spectral Theorem


We will need a few preliminary results, which apply to both real and complex
inner product spaces, for our proof of the Real Spectral Theorem.

You could guess that the next result is true and even discover its proof by
thinking about quadratic polynomials with real coefficients. Specifically,
suppose $b, c \in \mathbf{R}$ and $b^2 < 4c$. Let $x$ be a real number. Then
\[ x^2 + bx + c = \Bigl(x + \frac{b}{2}\Bigr)^2 + \Bigl(c - \frac{b^2}{4}\Bigr) > 0. \]
In particular, $x^2 + bx + c$ is an invertible real number (a convoluted way
of saying that it is not 0). Replacing the real number $x$ with a self-adjoint
operator (recall the analogy between real numbers and self-adjoint operators),
we are led to the result below.

[This technique of completing the square can be used to derive the
quadratic formula.]


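7.26 Invertible quadratic expressions


Suppose $T \in \mathcal{L}(V)$ is self-adjoint and $b, c \in \mathbf{R}$ are such that $b^2 < 4c$.
Then
\[ T^2 + bT + cI \]
is invertible.


Proof Let $v$ be a nonzero vector in $V$. Then
\begin{align*}
\langle (T^2 + bT + cI)v, v \rangle &= \langle T^2v, v \rangle + b \langle Tv, v \rangle + c \langle v, v \rangle \\
&= \langle Tv, Tv \rangle + b \langle Tv, v \rangle + c \|v\|^2 \\
&\ge \|Tv\|^2 - |b| \|Tv\| \|v\| + c \|v\|^2 \\
&= \Bigl( \|Tv\| - \frac{|b| \|v\|}{2} \Bigr)^2 + \Bigl( c - \frac{b^2}{4} \Bigr) \|v\|^2 \\
&> 0,
\end{align*}
where the third line above holds by the Cauchy-Schwarz Inequality (6.15).
The last inequality implies that $(T^2 + bT + cI)v \ne 0$. Thus $T^2 + bT + cI$
is injective, which implies that it is invertible (see 3.69). ■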
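A small numerical instance of 7.26 may help fix ideas. The sketch below is illustrative only; the symmetric matrix `T`, the scalars `b` and `c`, and the helper names are arbitrary choices satisfying the hypotheses. It forms the matrix of $T^2 + bT + cI$ and checks that its determinant is nonzero, so the operator is invertible.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def quadratic_in(T, b, c):
    """Matrix of T^2 + b*T + c*I."""
    n = len(T)
    T2 = matmul(T, T)
    return [[T2[r][k] + b * T[r][k] + (c if r == k else 0) for k in range(n)]
            for r in range(n)]


def det2(A):
    """Determinant of a 2-by-2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]


T = [[1, 2],
     [2, -1]]       # symmetric, hence self-adjoint on R^2 (illustrative choice)
b, c = 1, 1         # b^2 = 1 < 4 = 4c
S = quadratic_in(T, b, c)
```

The determinant is used here only as a convenient invertibility test for a 2-by-2 matrix; the proof above does not need it.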


We know that every operator, self-adjoint or not, on a finite-dimensional 
nonzero complex vector space has an eigenvalue (see 5.21). Thus the next 
result tells us something new only for real inner product spaces. 
7.27 Self-adjoint operators have eigenvalues


Suppose $V \ne \{0\}$ and $T \in \mathcal{L}(V)$ is a self-adjoint operator. Then $T$ has
an eigenvalue.


Proof We can assume that $V$ is a real inner product space, as we have already
noted. Let $n = \dim V$ and choose $v \in V$ with $v \ne 0$. Then
\[ v, Tv, T^2v, \dots, T^nv \]
cannot be linearly independent, because $V$ has dimension $n$ and we have $n + 1$
vectors. Thus there exist real numbers $a_0, \dots, a_n$, not all 0, such that
\[ 0 = a_0 v + a_1 Tv + \dots + a_n T^n v. \]
Make the $a$'s the coefficients of a polynomial, which can be written in factored
form (see 4.17) as
\begin{align*}
a_0 &+ a_1 x + \dots + a_n x^n \\
&= c (x^2 + b_1 x + c_1) \cdots (x^2 + b_M x + c_M)(x - \lambda_1) \cdots (x - \lambda_m),
\end{align*}
where $c$ is a nonzero real number, each $b_j$, $c_j$, and $\lambda_j$ is real, each $b_j^2$ is less
than $4c_j$, $m + M \ge 1$, and the equation holds for all real $x$. We then have
\begin{align*}
0 &= a_0 v + a_1 Tv + \dots + a_n T^n v \\
&= (a_0 I + a_1 T + \dots + a_n T^n) v \\
&= c (T^2 + b_1 T + c_1 I) \cdots (T^2 + b_M T + c_M I)(T - \lambda_1 I) \cdots (T - \lambda_m I) v.
\end{align*}
By 7.26, each $T^2 + b_j T + c_j I$ is invertible. Recall also that $c \ne 0$. Thus
the equation above implies that $m > 0$ and
\[ 0 = (T - \lambda_1 I) \cdots (T - \lambda_m I) v. \]
Hence $T - \lambda_j I$ is not injective for at least one $j$. In other words, $T$ has an
eigenvalue. ■


The next result shows that if $U$ is a subspace of $V$ that is invariant under
a self-adjoint operator $T$, then $U^\perp$ is also invariant under $T$. Later we will
show that the hypothesis that $T$ is self-adjoint can be replaced with the weaker
hypothesis that $T$ is normal (see 9.30).


7.28 Self-adjoint operators and invariant subspaces


Suppose $T \in \mathcal{L}(V)$ is self-adjoint and $U$ is a subspace of $V$ that is
invariant under $T$. Then

(a) $U^\perp$ is invariant under $T$;
(b) $T|_U \in \mathcal{L}(U)$ is self-adjoint;
(c) $T|_{U^\perp} \in \mathcal{L}(U^\perp)$ is self-adjoint.


Proof To prove (a), suppose $v \in U^\perp$. Let $u \in U$. Then
\[ \langle Tv, u \rangle = \langle v, Tu \rangle = 0, \]
where the first equality above holds because $T$ is self-adjoint and the second
equality above holds because $U$ is invariant under $T$ (and hence $Tu \in U$)
and because $v \in U^\perp$. Because the equation above holds for each $u \in U$, we
conclude that $Tv \in U^\perp$. Thus $U^\perp$ is invariant under $T$, completing the proof
of (a).

To prove (b), note that if $u, v \in U$, then
\[ \langle (T|_U)u, v \rangle = \langle Tu, v \rangle = \langle u, Tv \rangle = \langle u, (T|_U)v \rangle. \]
Thus $T|_U$ is self-adjoint.
Now (c) follows from replacing $U$ with $U^\perp$ in (b), which makes sense
by (a). ■


We can now prove the next result, which is one of the major theorems in
linear algebra.


7.29 Real Spectral Theorem

Suppose $\mathbf{F} = \mathbf{R}$ and $T \in \mathcal{L}(V)$. Then the following are equivalent:

(a) $T$ is self-adjoint.

(b) $V$ has an orthonormal basis consisting of eigenvectors of $T$.

(c) $T$ has a diagonal matrix with respect to some orthonormal basis
of $V$.


Proof First suppose (c) holds, so $T$ has a diagonal matrix with respect to
some orthonormal basis of $V$. A diagonal matrix equals its transpose. Hence
$T = T^*$, and thus $T$ is self-adjoint. In other words, (a) holds.

We will prove that (a) implies (b) by induction on $\dim V$. To get started,
note that if $\dim V = 1$, then (a) implies (b). Now assume that $\dim V > 1$ and
that (a) implies (b) for all real inner product spaces of smaller dimension.

Suppose (a) holds, so $T \in \mathcal{L}(V)$ is self-adjoint. Let $u$ be an eigenvector
of $T$ with $\|u\| = 1$ (7.27 guarantees that $T$ has an eigenvector, which can
then be divided by its norm to produce an eigenvector with norm 1). Let
$U = \operatorname{span}(u)$. Then $U$ is a 1-dimensional subspace of $V$ that is invariant
under $T$. By 7.28(c), the operator $T|_{U^\perp} \in \mathcal{L}(U^\perp)$ is self-adjoint.

By our induction hypothesis, there is an orthonormal basis of $U^\perp$ consisting
of eigenvectors of $T|_{U^\perp}$. Adjoining $u$ to this orthonormal basis of $U^\perp$
gives an orthonormal basis of $V$ consisting of eigenvectors of $T$, completing
the proof that (a) implies (b).

We have proved that (c) implies (a) and that (a) implies (b). Clearly (b)
implies (c), completing the proof. ■


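7.30 Example Consider the self-adjoint operator $T$ on $\mathbf{R}^3$ whose matrix
(with respect to the standard basis) is
\[ \begin{pmatrix} 14 & -13 & 8 \\ -13 & 14 & 8 \\ 8 & 8 & -7 \end{pmatrix}. \]
As you can verify,
\[ \frac{(1, -1, 0)}{\sqrt{2}}, \quad \frac{(1, 1, 1)}{\sqrt{3}}, \quad \frac{(1, 1, -2)}{\sqrt{6}} \]
is an orthonormal basis of $\mathbf{R}^3$ consisting of eigenvectors of $T$, and with respect
to this basis, the matrix of $T$ is the diagonal matrix
\[ \begin{pmatrix} 27 & 0 & 0 \\ 0 & 9 & 0 \\ 0 & 0 & -15 \end{pmatrix}. \]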
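The claims in Example 7.30 can be verified with integer arithmetic by working with the unnormalized vectors $(1, -1, 0)$, $(1, 1, 1)$, and $(1, 1, -2)$. The sketch below is an illustration with hypothetical helper names; it checks the eigenvalue equations and pairwise orthogonality.

```python
def apply(A, x):
    """Matrix-vector product."""
    return [sum(a * t for a, t in zip(row, x)) for row in A]


def dot(a, b):
    """Euclidean inner product on R^n."""
    return sum(s * t for s, t in zip(a, b))


T = [[14, -13, 8],
     [-13, 14, 8],
     [8, 8, -7]]

# (eigenvalue, unnormalized eigenvector) pairs from Example 7.30
pairs = [(27, [1, -1, 0]),
         (9, [1, 1, 1]),
         (-15, [1, 1, -2])]

eigen_ok = all(apply(T, v) == [lam * t for t in v] for lam, v in pairs)
orthogonal_ok = all(dot(pairs[i][1], pairs[j][1]) == 0
                    for i in range(3) for j in range(i + 1, 3))
norms_squared = [dot(v, v) for _, v in pairs]   # squared norms 2, 3, 6
```

The squared norms 2, 3, 6 account for the factors $\sqrt{2}$, $\sqrt{3}$, $\sqrt{6}$ in the orthonormal basis.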


If F = C, then the Complex Spectral Theorem gives a complete descrip- 
tion of the normal operators on V. A complete description of the self-adjoint 
operators on V then easily follows (they are the normal operators on V whose 
eigenvalues all are real; see Exercise 6). 

If F = R, then the Real Spectral Theorem gives a complete description 
of the self-adjoint operators on V. In Chapter 9, we will give a complete 
description of the normal operators on V (see 9.34). 
EXERCISES 7.B


1 True or false (and give a proof of your answer): There exists $T \in \mathcal{L}(\mathbf{R}^3)$
such that $T$ is not self-adjoint (with respect to the usual inner product)
and such that there is a basis of $\mathbf{R}^3$ consisting of eigenvectors of $T$.


2 Suppose that $T$ is a self-adjoint operator on a finite-dimensional inner
product space and that 2 and 3 are the only eigenvalues of $T$. Prove that
$T^2 - 5T + 6I = 0$.


3 Give an example of an operator $T \in \mathcal{L}(\mathbf{C}^3)$ such that 2 and 3 are the
only eigenvalues of $T$ and $T^2 - 5T + 6I \ne 0$.


4 Suppose $\mathbf{F} = \mathbf{C}$ and $T \in \mathcal{L}(V)$. Prove that $T$ is normal if and only if
all pairs of eigenvectors corresponding to distinct eigenvalues of $T$ are
orthogonal and
\[ V = E(\lambda_1, T) \oplus \dots \oplus E(\lambda_m, T), \]
where $\lambda_1, \dots, \lambda_m$ denote the distinct eigenvalues of $T$.


5 Suppose $\mathbf{F} = \mathbf{R}$ and $T \in \mathcal{L}(V)$. Prove that $T$ is self-adjoint if and only
if all pairs of eigenvectors corresponding to distinct eigenvalues of $T$ are
orthogonal and
\[ V = E(\lambda_1, T) \oplus \dots \oplus E(\lambda_m, T), \]
where $\lambda_1, \dots, \lambda_m$ denote the distinct eigenvalues of $T$.


6 Prove that a normal operator on a complex inner product space is
self-adjoint if and only if all its eigenvalues are real.
[The exercise above strengthens the analogy (for normal operators)
between self-adjoint operators and real numbers.]


7 Suppose $V$ is a complex inner product space and $T \in \mathcal{L}(V)$ is a normal
operator such that $T^9 = T^8$. Prove that $T$ is self-adjoint and $T^2 = T$.


8 Give an example of an operator $T$ on a complex vector space such that
$T^9 = T^8$ but $T^2 \ne T$.


9 Suppose $V$ is a complex inner product space. Prove that every normal
operator on $V$ has a square root. (An operator $S \in \mathcal{L}(V)$ is called a
square root of $T \in \mathcal{L}(V)$ if $S^2 = T$.)
10 Give an example of a real inner product space $V$ and $T \in \mathcal{L}(V)$ and real
numbers $b, c$ with $b^2 < 4c$ such that $T^2 + bT + cI$ is not invertible.
[The exercise above shows that the hypothesis that $T$ is self-adjoint is
needed in 7.26, even for real vector spaces.]


11 Prove or give a counterexample: every self-adjoint operator on $V$ has a
cube root. (An operator $S \in \mathcal{L}(V)$ is called a cube root of $T \in \mathcal{L}(V)$ if
$S^3 = T$.)


12 Suppose $T \in \mathcal{L}(V)$ is self-adjoint, $\lambda \in \mathbf{F}$, and $\epsilon > 0$. Suppose there
exists $v \in V$ such that $\|v\| = 1$ and
\[ \|Tv - \lambda v\| < \epsilon. \]
Prove that $T$ has an eigenvalue $\lambda'$ such that $|\lambda - \lambda'| < \epsilon$.


13 Give an alternative proof of the Complex Spectral Theorem that avoids
Schur’s Theorem and instead follows the pattern of the proof of the Real
Spectral Theorem.


14 Suppose $U$ is a finite-dimensional real vector space and $T \in \mathcal{L}(U)$.
Prove that $U$ has a basis consisting of eigenvectors of $T$ if and only if
there is an inner product on $U$ that makes $T$ into a self-adjoint operator.


15 Find the matrix entry below that is covered up.

[Figure: cartoon with the caption “keep away from him... he's not normal!”; the matrix shown in the figure is not recoverable here.]




7.C  Positive Operators and Isometries


Positive Operators 


7.31 Definition positive operator

An operator $T \in \mathcal{L}(V)$ is called positive if $T$ is self-adjoint and
\[ \langle Tv, v \rangle \ge 0 \]
for all $v \in V$.


If $V$ is a complex vector space, then the requirement that $T$ is self-adjoint
can be dropped from the definition above (by 7.15).


7.32 Example positive operators


(a) If $U$ is a subspace of $V$, then the orthogonal projection $P_U$ is a positive
operator, as you should verify.


(b) If $T \in \mathcal{L}(V)$ is self-adjoint and $b, c \in \mathbf{R}$ are such that $b^2 < 4c$, then
$T^2 + bT + cI$ is a positive operator, as shown by the proof of 7.26.


7.33 Definition square root


An operator $R$ is called a square root of an operator $T$ if $R^2 = T$.


7.34 Example If $T \in \mathcal{L}(\mathbf{F}^3)$ is defined by $T(z_1, z_2, z_3) = (z_3, 0, 0)$,
then the operator $R \in \mathcal{L}(\mathbf{F}^3)$ defined by $R(z_1, z_2, z_3) = (z_2, z_3, 0)$ is a
square root of $T$.
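Example 7.34 amounts to the computation $R(R(z_1, z_2, z_3)) = R(z_2, z_3, 0) = (z_3, 0, 0)$. A minimal sketch of the same check follows; the function names are illustrative, and testing a handful of vectors is only a sanity check, not a proof.

```python
def T(z):
    """T(z1, z2, z3) = (z3, 0, 0) from Example 7.34."""
    z1, z2, z3 = z
    return (z3, 0, 0)


def R(z):
    """R(z1, z2, z3) = (z2, z3, 0) from Example 7.34."""
    z1, z2, z3 = z
    return (z2, z3, 0)


def r_squared_equals_t(z):
    """Check R(R(z)) == T(z) for one vector z."""
    return R(R(z)) == T(z)
```

Note that this $T$ is not self-adjoint, so a square root in the sense of 7.33 does not require positivity.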


The characterizations of the positive operators in the next result correspond
to characterizations of the nonnegative numbers among $\mathbf{C}$. Specifically, a
complex number $z$ is nonnegative if and only if it has a nonnegative square root,
corresponding to condition (c). Also, $z$ is nonnegative if and only if it has
a real square root, corresponding to condition (d). Finally, $z$ is nonnegative
if and only if there exists a complex number $w$ such that $z = \bar{w}w$,
corresponding to condition (e).

[The positive operators correspond to the numbers $[0, \infty)$, so better
terminology would use the term nonnegative instead of positive. However,
operator theorists consistently call these the positive operators, so we will
follow that custom.]
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7.35 Characterization of positive operators 

Let T € L(V). Then the following are equivalent: 

(a) T is positive; 

(b) T is self-adjoint and all the eigenvalues of T are nonnegative; 
(c) T has a positive square root; 

(d) T has a self-adjoint square root; 


(e) there exists an operator R € L(V) such that T = R*R. 


Proof We will prove that (a) > (b) > (c) > (d) > (e) => (a). 

First suppose (a) holds, so that T is positive. Obviously T is self-adjoint 
(by the definition of a positive operator). To prove the other condition in (b), 
suppose A is an eigenvalue of T. Let v be an eigenvector of T corresponding 
to A. Then 

O=(Ty,v) = (àv, v) = A (v, v). 


Thus A is a nonnegative number. Hence (b) holds. 

Now suppose (b) holds, so that $T$ is self-adjoint and all the eigenvalues
of $T$ are nonnegative. By the Spectral Theorem (7.24 and 7.29), there is
an orthonormal basis $e_1, \dots, e_n$ of $V$ consisting of eigenvectors of $T$. Let
$\lambda_1, \dots, \lambda_n$ be the eigenvalues of $T$ corresponding to $e_1, \dots, e_n$; thus each
$\lambda_j$ is a nonnegative number. Let $R$ be the linear map from $V$ to $V$ such that

\[ Re_j = \sqrt{\lambda_j}\, e_j \]

for $j = 1, \dots, n$ (see 3.5). Then $R$ is a positive operator, as you should verify.
Furthermore, $R^2 e_j = \lambda_j e_j = Te_j$ for each $j$, which implies that $R^2 = T$.
Thus $R$ is a positive square root of $T$. Hence (c) holds.

Clearly (c) implies (d) (because, by definition, every positive operator is 
self-adjoint). 

Now suppose (d) holds, meaning that there exists a self-adjoint operator
$R$ on $V$ such that $T = R^2$. Then $T = R^*R$ (because $R^* = R$). Hence (e)
holds.

Finally, suppose (e) holds. Let $R \in \mathcal{L}(V)$ be such that $T = R^*R$. Then
$T^* = (R^*R)^* = R^*(R^*)^* = R^*R = T$. Hence $T$ is self-adjoint. To
complete the proof that (a) holds, note that

\[ \langle Tv, v \rangle = \langle R^*Rv, v \rangle = \langle Rv, Rv \rangle \ge 0 \]

for every $v \in V$. Thus $T$ is positive. ∎
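
The step from (b) to (c) is constructive, and it can be sketched in a few lines. The following is a minimal illustration, assuming a real inner product space $\mathbf{R}^2$ with the usual dot product; the orthonormal basis and the eigenvalues 4 and 9 are hypothetical data chosen for the example, and `outer_sum` and `matmul` are helper names introduced here.

```python
import math

def outer_sum(basis, weights):
    """Matrix of the operator sending each orthonormal e_j to weights[j] * e_j."""
    n = len(basis[0])
    return [[sum(w * e[i] * e[k] for e, w in zip(basis, weights))
             for k in range(n)] for i in range(n)]

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Hypothetical data: an orthonormal basis of R^2 and nonnegative eigenvalues.
c = 1 / math.sqrt(2)
basis = [(c, c), (c, -c)]
eigenvalues = [4.0, 9.0]

T = outer_sum(basis, eigenvalues)                          # T e_j = lambda_j e_j
R = outer_sum(basis, [math.sqrt(x) for x in eigenvalues])  # R e_j = sqrt(lambda_j) e_j

R2 = matmul(R, R)
squares_to_T = all(abs(R2[i][j] - T[i][j]) < 1e-12
                   for i in range(2) for j in range(2))
print(squares_to_T)
```

Taking square roots of the eigenvalues while keeping the eigenvectors fixed is exactly the definition of $R$ in the proof; the code only replays it in coordinates.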
Each nonnegative number has a unique nonnegative square root. The
next result shows that positive operators enjoy a similar property.

[Some mathematicians also use the term positive semidefinite operator, which
means the same as positive operator.]


7.36 Each positive operator has only one positive square root 


Every positive operator on V has a unique positive square root. 


Proof Suppose $T \in \mathcal{L}(V)$ is positive. Suppose $v \in V$ is an eigenvector of $T$.
Thus there exists $\lambda \ge 0$ such that $Tv = \lambda v$.

Let $R$ be a positive square root of $T$. We will prove that $Rv = \sqrt{\lambda}\, v$. This
will imply that the behavior of $R$ on the eigenvectors of $T$ is uniquely
determined. Because there is a basis of $V$ consisting of eigenvectors of $T$ (by the
Spectral Theorem), this will imply that $R$ is uniquely determined.

[A positive operator can have infinitely many square roots (although only one
of them can be positive). For example, the identity operator on $V$ has
infinitely many square roots if $\dim V > 1$.]

To prove that $Rv = \sqrt{\lambda}\, v$, note that the Spectral Theorem asserts that
there is an orthonormal basis $e_1, \dots, e_n$ of $V$ consisting of eigenvectors of $R$.
Because $R$ is a positive operator, all its eigenvalues are nonnegative. Thus
there exist nonnegative numbers $\lambda_1, \dots, \lambda_n$ such that $Re_j = \sqrt{\lambda_j}\, e_j$ for
$j = 1, \dots, n$.

Because $e_1, \dots, e_n$ is a basis of $V$, we can write

\[ v = a_1 e_1 + \cdots + a_n e_n \]

for some numbers $a_1, \dots, a_n \in \mathbf{F}$. Thus

\[ Rv = a_1 \sqrt{\lambda_1}\, e_1 + \cdots + a_n \sqrt{\lambda_n}\, e_n \]

and hence

\[ R^2 v = a_1 \lambda_1 e_1 + \cdots + a_n \lambda_n e_n. \]

But $R^2 = T$, and $Tv = \lambda v$. Thus the equation above implies

\[ a_1 \lambda e_1 + \cdots + a_n \lambda e_n = a_1 \lambda_1 e_1 + \cdots + a_n \lambda_n e_n. \]

The equation above implies that $a_j(\lambda - \lambda_j) = 0$ for $j = 1, \dots, n$. Hence

\[ v = \sum_{\{j \,:\, \lambda_j = \lambda\}} a_j e_j \]

and thus

\[ Rv = \sum_{\{j \,:\, \lambda_j = \lambda\}} a_j \sqrt{\lambda}\, e_j = \sqrt{\lambda}\, v, \]

as desired. ∎
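
The remark that the identity operator has infinitely many square roots when $\dim V > 1$ can be made concrete. The sketch below assumes $V = \mathbf{F}^2$ with the standard basis; the one-parameter family used here is an illustrative choice (not taken from the text), and only integer parameters are tested.

```python
def matmul2(A, B):
    """Product of two 2-by-2 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def root_of_identity(t):
    """Hypothetical family: the matrix [[1, t], [0, -1]] squares to the identity."""
    return [[1, t], [0, -1]]

I2 = [[1, 0], [0, 1]]
all_square_to_identity = all(
    matmul2(root_of_identity(t), root_of_identity(t)) == I2
    for t in range(-5, 6)
)
print(all_square_to_identity)
```

None of these square roots is positive: the lower-right entry $-1$ means that $\langle Re_2, e_2 \rangle = -1 < 0$ for the second standard basis vector $e_2$. This is consistent with 7.36, which singles out the identity itself as the only positive square root.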
Isometries


Operators that preserve norms are sufficiently important to deserve a name: 


7.37 Definition isometry 


• An operator $S \in \mathcal{L}(V)$ is called an isometry if
\[ \|Sv\| = \|v\| \]
for all $v \in V$.


• In other words, an operator is an isometry if it preserves norms.


For example, $\lambda I$ is an isometry whenever $\lambda \in \mathbf{F}$ satisfies $|\lambda| = 1$. We
will see soon that if $\mathbf{F} = \mathbf{C}$, then the next example includes all isometries.

[The Greek word isos means equal; the Greek word metron means measure. Thus
isometry literally means equal measure.]

7.38 Example Suppose $\lambda_1, \dots, \lambda_n$ are scalars with absolute value 1 and
$S \in \mathcal{L}(V)$ satisfies $Se_j = \lambda_j e_j$ for some orthonormal basis $e_1, \dots, e_n$ of $V$.
Show that $S$ is an isometry.


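Solution Suppose $v \in V$. Then

7.39 \[ v = \langle v, e_1 \rangle e_1 + \cdots + \langle v, e_n \rangle e_n \]

and

7.40 \[ \|v\|^2 = |\langle v, e_1 \rangle|^2 + \cdots + |\langle v, e_n \rangle|^2, \]

where we have used 6.30. Applying $S$ to both sides of 7.39 gives

\[
\begin{aligned}
Sv &= \langle v, e_1 \rangle Se_1 + \cdots + \langle v, e_n \rangle Se_n \\
&= \lambda_1 \langle v, e_1 \rangle e_1 + \cdots + \lambda_n \langle v, e_n \rangle e_n.
\end{aligned}
\]

The last equation, along with the equation $|\lambda_j| = 1$, shows that

7.41 \[ \|Sv\|^2 = |\langle v, e_1 \rangle|^2 + \cdots + |\langle v, e_n \rangle|^2. \]

Comparing 7.40 and 7.41 shows that $\|v\| = \|Sv\|$. In other words, $S$ is an
isometry.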
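
The computation in Example 7.38 can be replayed numerically. This is a minimal sketch, assuming $V = \mathbf{C}^3$ with the standard (orthonormal) basis; the three angles and the test vector are hypothetical values chosen for illustration.

```python
import cmath
import math

# Hypothetical scalars of absolute value 1, one per standard basis vector.
lambdas = [cmath.exp(1j * theta) for theta in (0.3, 1.7, -2.5)]

def apply_S(v):
    """S e_j = lambda_j e_j in the standard basis, so S acts coordinatewise."""
    return [lam * x for lam, x in zip(lambdas, v)]

def norm(v):
    """Euclidean norm on C^n."""
    return math.sqrt(sum(abs(x) ** 2 for x in v))

v = [1 + 2j, -3j, 0.5]
norm_v = norm(v)
norm_Sv = norm(apply_S(v))
print(abs(norm_v - norm_Sv) < 1e-12)
```

Each coordinate of $v$ is multiplied by a number of absolute value 1, so every term $|\langle v, e_j \rangle|^2$ in 7.40 is unchanged, which is the content of 7.41.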
The next result provides several conditions that are equivalent to being an
isometry. The equivalence of (a) and (b) shows that an operator is an isometry if
and only if it preserves inner products. The equivalence of (a) and (c) [or (d)]
shows that an operator is an isometry if and only if the list of columns of its
matrix with respect to every [or some] orthonormal basis is orthonormal. Exercise 10
implies that in the previous sentence we can replace “columns” with “rows”.

[An isometry on a real inner product space is often called an orthogonal
operator. An isometry on a complex inner product space is often called a unitary
operator. We use the term isometry so that our results can apply to both real and
complex inner product spaces.]


7.42 Characterization of isometries 


Suppose $S \in \mathcal{L}(V)$. Then the following are equivalent:


(a) $S$ is an isometry;

(b) $\langle Su, Sv \rangle = \langle u, v \rangle$ for all $u, v \in V$;

(c) $Se_1, \dots, Se_n$ is orthonormal for every orthonormal list of vectors
$e_1, \dots, e_n$ in $V$;

(d) there exists an orthonormal basis $e_1, \dots, e_n$ of $V$ such that
$Se_1, \dots, Se_n$ is orthonormal;

(e) $S^*S = I$;

(f) $SS^* = I$;

(g) $S^*$ is an isometry;

(h) $S$ is invertible and $S^{-1} = S^*$.


Proof First suppose (a) holds, so $S$ is an isometry. Exercises 19 and 20 in
Section 6.A show that inner products can be computed from norms. Because
$S$ preserves norms, this implies that $S$ preserves inner products, and hence
(b) holds. More precisely, if $V$ is a real inner product space, then for every
$u, v \in V$ we have

\[
\begin{aligned}
\langle Su, Sv \rangle &= (\|Su + Sv\|^2 - \|Su - Sv\|^2)/4 \\
&= (\|S(u + v)\|^2 - \|S(u - v)\|^2)/4 \\
&= (\|u + v\|^2 - \|u - v\|^2)/4 \\
&= \langle u, v \rangle,
\end{aligned}
\]

where the first equality comes from Exercise 19 in Section 6.A, the second
equality comes from the linearity of $S$, the third equality holds because $S$ is an
isometry, and the last equality again comes from Exercise 19 in Section 6.A.
If $V$ is a complex inner product space, then use Exercise 20 in Section 6.A
instead of Exercise 19 to obtain the same conclusion. In either case, we see
that (b) holds.

Now suppose (b) holds, so $S$ preserves inner products. Suppose that
$e_1, \dots, e_n$ is an orthonormal list of vectors in $V$. Then we see that the list
$Se_1, \dots, Se_n$ is orthonormal because $\langle Se_j, Se_k \rangle = \langle e_j, e_k \rangle$. Thus (c) holds.

Clearly (c) implies (d). 

Now suppose (d) holds. Let $e_1, \dots, e_n$ be an orthonormal basis of $V$ such
that $Se_1, \dots, Se_n$ is orthonormal. Thus

\[ \langle S^*Se_j, e_k \rangle = \langle e_j, e_k \rangle \]

for $j, k = 1, \dots, n$ [because the term on the left equals $\langle Se_j, Se_k \rangle$ and
$(Se_1, \dots, Se_n)$ is orthonormal]. All vectors $u, v \in V$ can be written as
linear combinations of $e_1, \dots, e_n$, and thus the equation above implies that
$\langle S^*Su, v \rangle = \langle u, v \rangle$. Hence $S^*S = I$; in other words, (e) holds.

Now suppose (e) holds, so that $S^*S = I$. In general, an operator $S$ need
not commute with $S^*$. However, $S^*S = I$ if and only if $SS^* = I$; this is a
special case of Exercise 10 in Section 3.D. Thus $SS^* = I$, showing that (f)
holds.

Now suppose (f) holds, so $SS^* = I$. If $v \in V$, then

\[ \|S^*v\|^2 = \langle S^*v, S^*v \rangle = \langle SS^*v, v \rangle = \langle v, v \rangle = \|v\|^2. \]

Thus $S^*$ is an isometry, showing that (g) holds.

Now suppose (g) holds, so $S^*$ is an isometry. We know that (a) $\Rightarrow$ (e) and
(a) $\Rightarrow$ (f) because we have shown (a) $\Rightarrow$ (b) $\Rightarrow$ (c) $\Rightarrow$ (d) $\Rightarrow$ (e) $\Rightarrow$ (f). Using
the implications (a) $\Rightarrow$ (e) and (a) $\Rightarrow$ (f) but with $S$ replaced with $S^*$ [and
using the equation $(S^*)^* = S$], we conclude that $SS^* = I$ and $S^*S = I$.
Thus $S$ is invertible and $S^{-1} = S^*$; in other words, (h) holds.

Now suppose (h) holds, so $S$ is invertible and $S^{-1} = S^*$. Thus $S^*S = I$.
If $v \in V$, then

\[ \|Sv\|^2 = \langle Sv, Sv \rangle = \langle S^*Sv, v \rangle = \langle v, v \rangle = \|v\|^2. \]

Thus $S$ is an isometry, showing that (a) holds.

We have shown (a) $\Rightarrow$ (b) $\Rightarrow$ (c) $\Rightarrow$ (d) $\Rightarrow$ (e) $\Rightarrow$ (f) $\Rightarrow$ (g) $\Rightarrow$ (h) $\Rightarrow$ (a),
completing the proof. ∎
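
Several of the conditions in 7.42 are easy to test on a concrete operator. The following is a minimal sketch, assuming $V = \mathbf{R}^2$ with the dot product, so that the adjoint is given by the transpose with respect to the standard basis; the rotation angle and the two test vectors are hypothetical values.

```python
import math

theta = 0.7  # hypothetical rotation angle
S = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]

def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def is_identity(A, tol=1e-12):
    return all(abs(A[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(len(A)) for j in range(len(A)))

condition_e = is_identity(matmul(transpose(S), S))  # S*S = I
condition_f = is_identity(matmul(S, transpose(S)))  # SS* = I

u, v = [1.0, 2.0], [-3.0, 0.5]
condition_b = abs(dot(apply(S, u), apply(S, v)) - dot(u, v)) < 1e-12
print(condition_b, condition_e, condition_f)
```

Checking (b) on a single pair of vectors is of course only a spot check; (e) and (f) are the conditions that can be verified completely from the matrix.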
The previous result shows that every isometry is normal [see (a), (e), and
(f) of 7.42]. Thus characterizations of normal operators can be used to give 
descriptions of isometries. We do this in the next result in the complex case 
and in Chapter 9 in the real case (see 9.36). 


7.43 Description of isometries when F = C 


Suppose $V$ is a complex inner product space and $S \in \mathcal{L}(V)$. Then the
following are equivalent:


(a) S is an isometry. 


(b) There is an orthonormal basis of V consisting of eigenvectors of S 
whose corresponding eigenvalues all have absolute value 1. 


Proof We have already shown (see Example 7.38) that (b) implies (a).

To prove the other direction, suppose (a) holds, so $S$ is an isometry. By the
Complex Spectral Theorem (7.24), there is an orthonormal basis $e_1, \dots, e_n$
of $V$ consisting of eigenvectors of $S$. For $j \in \{1, \dots, n\}$, let $\lambda_j$ be the
eigenvalue corresponding to $e_j$. Then

\[ |\lambda_j| = \|\lambda_j e_j\| = \|Se_j\| = \|e_j\| = 1. \]

Thus each eigenvalue of $S$ has absolute value 1, completing the proof. ∎
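
As an illustration of 7.43, a rotation of the plane has no real eigenvalues in general, but regarded as an operator on $\mathbf{C}^2$ it has an orthonormal basis of eigenvectors with eigenvalues on the unit circle. The sketch below assumes the standard inner product on $\mathbf{C}^2$; the angle and the candidate eigenvectors are supplied by hand for this example rather than computed by a general algorithm.

```python
import cmath
import math

theta = 0.7  # hypothetical rotation angle
S = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]

r = 1 / math.sqrt(2)
e1, lam1 = [r, -1j * r], cmath.exp(1j * theta)   # candidate eigenpair
e2, lam2 = [r, 1j * r], cmath.exp(-1j * theta)   # candidate eigenpair

def apply(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def inner(u, v):
    """Standard inner product on C^n, conjugate-linear in the second slot."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

def is_eigenpair(A, v, lam, tol=1e-12):
    return all(abs(x - lam * y) < tol for x, y in zip(apply(A, v), v))

eigen_ok = is_eigenpair(S, e1, lam1) and is_eigenpair(S, e2, lam2)
orthonormal = (abs(inner(e1, e2)) < 1e-12
               and abs(inner(e1, e1) - 1) < 1e-12
               and abs(inner(e2, e2) - 1) < 1e-12)
unit_modulus = abs(abs(lam1) - 1) < 1e-12 and abs(abs(lam2) - 1) < 1e-12
print(eigen_ok, orthonormal, unit_modulus)
```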


EXERCISES 7.C


1 Prove or give a counterexample: If $T \in \mathcal{L}(V)$ is self-adjoint and there
exists an orthonormal basis $e_1, \dots, e_n$ of $V$ such that $\langle Te_j, e_j \rangle \ge 0$ for
each $j$, then $T$ is a positive operator.


2 Suppose T is a positive operator on V. Suppose v, w € V are such that 
Tv=w and Tw=v. 
Prove that v = w. 


3 Suppose $T$ is a positive operator on $V$ and $U$ is a subspace of $V$ invariant
under $T$. Prove that $T|_U \in \mathcal{L}(U)$ is a positive operator on $U$.


4 Suppose $T \in \mathcal{L}(V, W)$. Prove that $T^*T$ is a positive operator on $V$ and
$TT^*$ is a positive operator on $W$.
5 Prove that the sum of two positive operators on $V$ is positive.


6 Suppose $T \in \mathcal{L}(V)$ is positive. Prove that $T^k$ is positive for every
positive integer $k$.


7 Suppose $T$ is a positive operator on $V$. Prove that $T$ is invertible if and
only if
\[ \langle Tv, v \rangle > 0 \]
for every $v \in V$ with $v \ne 0$.


8 Suppose $T \in \mathcal{L}(V)$. For $u, v \in V$, define $\langle u, v \rangle_T$ by
\[ \langle u, v \rangle_T = \langle Tu, v \rangle. \]
Prove that $\langle \cdot, \cdot \rangle_T$ is an inner product on $V$ if and only if $T$ is an invertible
positive operator (with respect to the original inner product $\langle \cdot, \cdot \rangle$).


9 Prove or disprove: the identity operator on $\mathbf{F}^2$ has infinitely many
self-adjoint square roots.


10 Suppose $S \in \mathcal{L}(V)$. Prove that the following are equivalent:

(a) $S$ is an isometry;

(b) $\langle S^*u, S^*v \rangle = \langle u, v \rangle$ for all $u, v \in V$;

(c) $S^*e_1, \dots, S^*e_m$ is an orthonormal list for every orthonormal list
of vectors $e_1, \dots, e_m$ in $V$;

(d) $S^*e_1, \dots, S^*e_n$ is an orthonormal basis for some orthonormal
basis $e_1, \dots, e_n$ of $V$.


11 Suppose $T_1, T_2$ are normal operators on $\mathcal{L}(\mathbf{F}^3)$ and both operators have
2, 5, 7 as eigenvalues. Prove that there exists an isometry $S \in \mathcal{L}(\mathbf{F}^3)$
such that $T_1 = S^*T_2S$.


12 Give an example of two self-adjoint operators $T_1, T_2 \in \mathcal{L}(\mathbf{F}^4)$ such that
the eigenvalues of both operators are 2, 5, 7 but there does not exist an
isometry $S \in \mathcal{L}(\mathbf{F}^4)$ such that $T_1 = S^*T_2S$. Be sure to explain why
there is no isometry with the required property.


13 Prove or give a counterexample: if $S \in \mathcal{L}(V)$ and there exists an
orthonormal basis $e_1, \dots, e_n$ of $V$ such that $\|Se_j\| = 1$ for each $e_j$, then $S$
is an isometry.


14 Let $T$ be the second derivative operator in Exercise 21 in Section 7.A.
Show that $-T$ is a positive operator.
7.D Polar Decomposition and Singular
Value Decomposition 


Polar Decomposition 


Recall our analogy between $\mathbf{C}$ and $\mathcal{L}(V)$. Under this analogy, a complex
number $z$ corresponds to an operator $T$, and $\bar{z}$ corresponds to $T^*$. The real
numbers ($z = \bar{z}$) correspond to the self-adjoint operators ($T = T^*$), and the
nonnegative numbers correspond to the (badly named) positive operators.

Another distinguished subset of $\mathbf{C}$ is the unit circle, which consists of the
complex numbers $z$ such that $|z| = 1$. The condition $|z| = 1$ is equivalent
to the condition $\bar{z}z = 1$. Under our analogy, this would correspond to the
condition $T^*T = I$, which is equivalent to $T$ being an isometry (see 7.42).
In other words, the unit circle in $\mathbf{C}$ corresponds to the isometries.

Continuing with our analogy, note that each complex number $z$ except 0
can be written in the form

\[ z = \Bigl(\frac{z}{|z|}\Bigr)|z| = \Bigl(\frac{z}{|z|}\Bigr)\sqrt{\bar{z}z}, \]

where the first factor, namely, $z/|z|$, is an element of the unit circle. Our
analogy leads us to guess that each operator $T \in \mathcal{L}(V)$ can be written as an
isometry times $\sqrt{T^*T}$. That guess is indeed correct, as we now prove after
defining the obvious notation, which is justified by 7.36.
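
As a quick numerical instance of the factorization of a nonzero complex number described above (the value $z = 3 + 4i$ is chosen here purely for illustration and does not come from the text):

```latex
\[
z = 3 + 4i, \qquad \bar{z}z = (3 - 4i)(3 + 4i) = 25, \qquad |z| = \sqrt{\bar{z}z} = 5,
\]
\[
z = \Bigl(\tfrac{3}{5} + \tfrac{4}{5}i\Bigr) \cdot 5,
\qquad
\Bigl|\tfrac{3}{5} + \tfrac{4}{5}i\Bigr| = \sqrt{\tfrac{9}{25} + \tfrac{16}{25}} = 1.
\]
```

The first factor lies on the unit circle and the second is a nonnegative number, which is the pattern the Polar Decomposition imitates with an isometry and a positive operator.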


7.44 Notation $\sqrt{T}$

If $T$ is a positive operator, then $\sqrt{T}$ denotes the unique positive square
root of $T$.


Now we can state and prove the Polar Decomposition, which gives a
beautiful description of an arbitrary operator on $V$. Note that $T^*T$ is a
positive operator for every $T \in \mathcal{L}(V)$, and thus $\sqrt{T^*T}$ is well defined.


7.45 Polar Decomposition 


Suppose $T \in \mathcal{L}(V)$. Then there exists an isometry $S \in \mathcal{L}(V)$ such that
\[ T = S\sqrt{T^*T}. \]
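Proof If $v \in V$, then

\[
\begin{aligned}
\|Tv\|^2 = \langle Tv, Tv \rangle &= \langle T^*Tv, v \rangle \\
&= \langle \sqrt{T^*T}\sqrt{T^*T}v, v \rangle \\
&= \langle \sqrt{T^*T}v, \sqrt{T^*T}v \rangle \\
&= \|\sqrt{T^*T}v\|^2.
\end{aligned}
\]

Thus

7.46 \[ \|Tv\| = \|\sqrt{T^*T}v\| \]

for all $v \in V$.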
Define a linear map $S_1 \colon \operatorname{range} \sqrt{T^*T} \to \operatorname{range} T$ by

7.47 \[ S_1(\sqrt{T^*T}v) = Tv. \]

The idea of the proof is to extend $S_1$ to an isometry $S \in \mathcal{L}(V)$ such that
$T = S\sqrt{T^*T}$. Now for the details.

First we must check that $S_1$ is well defined. To do this, suppose $v_1, v_2 \in V$
are such that $\sqrt{T^*T}v_1 = \sqrt{T^*T}v_2$. For the definition given by 7.47 to make
sense, we must show that $Tv_1 = Tv_2$. Note that

\[
\begin{aligned}
\|Tv_1 - Tv_2\| &= \|T(v_1 - v_2)\| \\
&= \|\sqrt{T^*T}(v_1 - v_2)\| \\
&= \|\sqrt{T^*T}v_1 - \sqrt{T^*T}v_2\| \\
&= 0,
\end{aligned}
\]

where the second equality holds by 7.46. The equation above shows that
$Tv_1 = Tv_2$, so $S_1$ is indeed well defined. You should verify that $S_1$ is a
linear map.

We see from 7.47 that $S_1$ maps $\operatorname{range} \sqrt{T^*T}$ onto $\operatorname{range} T$. Clearly 7.46
and 7.47 imply that

\[ \|S_1 u\| = \|u\| \]

for all $u \in \operatorname{range} \sqrt{T^*T}$.

[The rest of the proof extends $S_1$ to an isometry $S$ on all of $V$.]

In particular, $S_1$ is injective. Thus from the Fundamental Theorem of Linear
Maps (3.22), applied to $S_1$, we have

\[ \dim \operatorname{range} \sqrt{T^*T} = \dim \operatorname{range} T. \]
This implies that $\dim (\operatorname{range} \sqrt{T^*T})^\perp = \dim (\operatorname{range} T)^\perp$ (see 6.50).
Thus orthonormal bases $e_1, \dots, e_m$ of $(\operatorname{range} \sqrt{T^*T})^\perp$ and $f_1, \dots, f_m$
of $(\operatorname{range} T)^\perp$ can be chosen; the key point here is that these two
orthonormal bases have the same length (denoted $m$). Now define a linear map
$S_2 \colon (\operatorname{range} \sqrt{T^*T})^\perp \to (\operatorname{range} T)^\perp$ by

\[ S_2(a_1 e_1 + \cdots + a_m e_m) = a_1 f_1 + \cdots + a_m f_m. \]

For all $w \in (\operatorname{range} \sqrt{T^*T})^\perp$, we have $\|S_2 w\| = \|w\|$ (from 6.25).

Now let $S$ be the operator on $V$ that equals $S_1$ on $\operatorname{range} \sqrt{T^*T}$ and equals
$S_2$ on $(\operatorname{range} \sqrt{T^*T})^\perp$. More precisely, recall that each $v \in V$ can be written
uniquely in the form

7.48 \[ v = u + w, \]

where $u \in \operatorname{range} \sqrt{T^*T}$ and $w \in (\operatorname{range} \sqrt{T^*T})^\perp$ (see 6.47). For $v \in V$
with decomposition as above, define $Sv$ by

\[ Sv = S_1 u + S_2 w. \]

For each $v \in V$ we have

\[ S(\sqrt{T^*T}v) = S_1(\sqrt{T^*T}v) = Tv, \]

so $T = S\sqrt{T^*T}$, as desired. All that remains is to show that $S$ is an isometry.
However, this follows easily from two uses of the Pythagorean Theorem: if
$v \in V$ has decomposition as in 7.48, then

\[ \|Sv\|^2 = \|S_1 u + S_2 w\|^2 = \|S_1 u\|^2 + \|S_2 w\|^2 = \|u\|^2 + \|w\|^2 = \|v\|^2; \]

the second equality holds because $S_1 u \in \operatorname{range} T$ and $S_2 w \in (\operatorname{range} T)^\perp$. ∎
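
When $T$ is invertible, $\operatorname{range} \sqrt{T^*T}$ is all of $V$, so the map $S_1$ in the proof is already the isometry and equals $T(\sqrt{T^*T})^{-1}$. The following is a minimal sketch of that special case on $\mathbf{R}^2$ (adjoint = transpose). It is not the construction used in the proof: the square root is computed with a closed form that is valid only for positive invertible 2-by-2 matrices, and the matrix `T` is a hypothetical example.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def det(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def sqrt_positive_2x2(A):
    """Positive square root of a positive invertible 2x2 matrix A.

    Shortcut specific to 2x2 matrices: sqrt(A) = (A + s I) / t with
    s = sqrt(det A) and t = sqrt(trace A + 2 s).
    """
    s = math.sqrt(det(A))
    t = math.sqrt(A[0][0] + A[1][1] + 2 * s)
    return [[(A[0][0] + s) / t, A[0][1] / t],
            [A[1][0] / t, (A[1][1] + s) / t]]

def inverse_2x2(A):
    d = det(A)
    return [[A[1][1] / d, -A[0][1] / d],
            [-A[1][0] / d, A[0][0] / d]]

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

T = [[1.0, 2.0], [0.0, 1.0]]                    # hypothetical invertible operator
P = sqrt_positive_2x2(matmul(transpose(T), T))  # P plays the role of sqrt(T*T)
S = matmul(T, inverse_2x2(P))                   # S = T P^{-1}

identity = [[1.0, 0.0], [0.0, 1.0]]
print(close(matmul(transpose(S), S), identity), close(matmul(S, P), T))
```

For this particular `T`, the isometry works out to a rotation of the plane. For a non-invertible $T$ the inverse above does not exist, and the extension step via $S_2$ in the proof is genuinely needed.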


The Polar Decomposition (7.45) states that each operator on V is the 
product of an isometry and a positive operator. Thus we can write each 
operator on V as the product of two operators, each of which comes from 
a class that we can completely describe and that we understand reasonably 
well. The isometries are described by 7.43 and 9.36; the positive operators 
are described by the Spectral Theorem (7.24 and 7.29). 

Specifically, consider the case $\mathbf{F} = \mathbf{C}$, and suppose $T = S\sqrt{T^*T}$ is a
Polar Decomposition of an operator $T \in \mathcal{L}(V)$, where $S$ is an isometry. Then
there is an orthonormal basis of $V$ with respect to which $S$ has a diagonal
matrix, and there is an orthonormal basis of $V$ with respect to which $\sqrt{T^*T}$
has a diagonal matrix. Warning: there may not exist an orthonormal basis
that simultaneously puts the matrices of both $S$ and $\sqrt{T^*T}$ into these nice
diagonal forms. In other words, $S$ may require one orthonormal basis and
$\sqrt{T^*T}$ may require a different orthonormal basis.
Singular Value Decomposition


The eigenvalues of an operator tell us something about the behavior of the 
operator. Another collection of numbers, called the singular values, is also 
useful. Recall that eigenspaces and the notation $E$ are defined in 5.36.


7.49 Definition singular values 


Suppose $T \in \mathcal{L}(V)$. The singular values of $T$ are the eigenvalues
of $\sqrt{T^*T}$, with each eigenvalue $\lambda$ repeated $\dim E(\lambda, \sqrt{T^*T})$ times.


The singular values of $T$ are all nonnegative, because they are the
eigenvalues of the positive operator $\sqrt{T^*T}$.


7.50 Example Define $T \in \mathcal{L}(\mathbf{F}^4)$ by
\[ T(z_1, z_2, z_3, z_4) = (0, 3z_1, 2z_2, -3z_4). \]
Find the singular values of $T$.


Solution A calculation shows $T^*T(z_1, z_2, z_3, z_4) = (9z_1, 4z_2, 0, 9z_4)$, as
you should verify. Thus

\[ \sqrt{T^*T}(z_1, z_2, z_3, z_4) = (3z_1, 2z_2, 0, 3z_4), \]

and we see that the eigenvalues of $\sqrt{T^*T}$ are 3, 2, 0 and

\[ \dim E(3, \sqrt{T^*T}) = 2, \quad \dim E(2, \sqrt{T^*T}) = 1, \quad \dim E(0, \sqrt{T^*T}) = 1. \]

Hence the singular values of $T$ are 3, 3, 2, 0.

Note that $-3$ and 0 are the only eigenvalues of $T$. Thus in this case, the
collection of eigenvalues did not pick up the number 2 that appears in the
definition (and hence the behavior) of $T$, but the collection of singular values
does include 2.
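
The calculation in Example 7.50 can be reproduced directly. This is a minimal sketch, assuming the standard basis of $\mathbf{F}^4$ (which is orthonormal) and real entries, so that the matrix of $T^*$ is the transpose of the matrix of $T$. It relies on the fact that $T^*T$ happens to be diagonal here, which is checked rather than assumed.

```python
import math

# Matrix of T(z1, z2, z3, z4) = (0, 3 z1, 2 z2, -3 z4) in the standard basis.
T = [[0, 0, 0, 0],
     [3, 0, 0, 0],
     [0, 2, 0, 0],
     [0, 0, 0, -3]]

def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

TstarT = matmul(transpose(T), T)
is_diagonal = all(TstarT[i][j] == 0
                  for i in range(4) for j in range(4) if i != j)

# For a diagonal T*T, the eigenvalues are the diagonal entries, and the
# singular values of T are their nonnegative square roots.
singular_values = sorted((math.sqrt(TstarT[j][j]) for j in range(4)),
                         reverse=True)
print(is_diagonal, singular_values)
```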


Each $T \in \mathcal{L}(V)$ has $\dim V$ singular values, as can be seen by applying
the Spectral Theorem and 5.41 [see especially part (e)] to the positive (hence
self-adjoint) operator $\sqrt{T^*T}$. For example, the operator $T$ defined in
Example 7.50 on the four-dimensional vector space $\mathbf{F}^4$ has four singular values
(they are 3, 3, 2, 0), as we saw above.

The next result shows that every operator on $V$ has a clean description in
terms of its singular values and two orthonormal bases of $V$.
7.51 Singular Value Decomposition


Suppose $T \in \mathcal{L}(V)$ has singular values $s_1, \dots, s_n$. Then there exist
orthonormal bases $e_1, \dots, e_n$ and $f_1, \dots, f_n$ of $V$ such that

\[ Tv = s_1 \langle v, e_1 \rangle f_1 + \cdots + s_n \langle v, e_n \rangle f_n \]

for every $v \in V$.


Proof By the Spectral Theorem applied to $\sqrt{T^*T}$, there is an orthonormal
basis $e_1, \dots, e_n$ of $V$ such that $\sqrt{T^*T}e_j = s_j e_j$ for $j = 1, \dots, n$.
We have
\[ v = \langle v, e_1 \rangle e_1 + \cdots + \langle v, e_n \rangle e_n \]
for every $v \in V$ (see 6.30). Apply $\sqrt{T^*T}$ to both sides of this equation,
getting
\[ \sqrt{T^*T}v = s_1 \langle v, e_1 \rangle e_1 + \cdots + s_n \langle v, e_n \rangle e_n \]
for every $v \in V$. By the Polar Decomposition (see 7.45), there is an isometry
$S \in \mathcal{L}(V)$ such that $T = S\sqrt{T^*T}$. Apply $S$ to both sides of the equation
above, getting

\[ Tv = s_1 \langle v, e_1 \rangle Se_1 + \cdots + s_n \langle v, e_n \rangle Se_n \]

for every $v \in V$. For each $j$, let $f_j = Se_j$. Because $S$ is an isometry,
$f_1, \dots, f_n$ is an orthonormal basis of $V$ (see 7.42). The equation above now
becomes

\[ Tv = s_1 \langle v, e_1 \rangle f_1 + \cdots + s_n \langle v, e_n \rangle f_n \]

for every $v \in V$, completing the proof. ∎


When we worked with linear maps from one vector space to a second 
vector space, we considered the matrix of a linear map with respect to a basis 
of the first vector space and a basis of the second vector space. When dealing 
with operators, which are linear maps from a vector space to itself, we almost 
always use only one basis, making it play both roles. 

The Singular Value Decomposition allows us a rare opportunity to make
good use of two different bases for the matrix of an operator. To do this,
suppose $T \in \mathcal{L}(V)$. Let $s_1, \dots, s_n$ denote the singular values of $T$, and let
$e_1, \dots, e_n$ and $f_1, \dots, f_n$ be orthonormal bases of $V$ such that the Singular
Value Decomposition 7.51 holds. Because $Te_j = s_j f_j$ for each $j$, we have

\[
\mathcal{M}\bigl(T, (e_1, \dots, e_n), (f_1, \dots, f_n)\bigr) =
\begin{pmatrix} s_1 & & 0 \\ & \ddots & \\ 0 & & s_n \end{pmatrix}.
\]
In other words, every operator on V has a diagonal matrix with respect
to some orthonormal bases of V, provided that we are permitted to use two 
different bases rather than a single basis as customary when working with 
operators. 

Singular values and the Singular Value Decomposition have many applica- 
tions (some are given in the exercises), including applications in computational 
linear algebra. To compute numeric approximations to the singular values of 
an operator T, first compute T*T and then compute approximations to the 
eigenvalues of T*T (good techniques exist for approximating eigenvalues 
of positive operators). The nonnegative square roots of these (approximate) 
eigenvalues of T*T will be the (approximate) singular values of T. In other 
words, the singular values of T can be approximated without computing the 
square root of $T^*T$. The next result helps justify working with $T^*T$ instead
of $\sqrt{T^*T}$.


7.52 Singular values without taking square root of an operator 


Suppose $T \in \mathcal{L}(V)$. Then the singular values of $T$ are the nonnegative
square roots of the eigenvalues of $T^*T$, with each eigenvalue $\lambda$ repeated
$\dim E(\lambda, T^*T)$ times.


Proof The Spectral Theorem implies that there are an orthonormal basis
$e_1, \dots, e_n$ and nonnegative numbers $\lambda_1, \dots, \lambda_n$ such that $T^*Te_j = \lambda_j e_j$
for $j = 1, \dots, n$. It is easy to see that $\sqrt{T^*T}e_j = \sqrt{\lambda_j}\, e_j$ for $j = 1, \dots, n$,
which implies the desired result. ∎
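
Result 7.52 suggests a direct recipe: form $T^*T$, find its eigenvalues, take nonnegative square roots. The following is a minimal sketch of that recipe for a real 2-by-2 matrix, where the eigenvalues of the symmetric matrix $T^*T$ can be found with the quadratic formula. The function name `singular_values_2x2` and the example matrix are assumptions made for this illustration; this is not a numerically robust general-purpose method.

```python
import math

def singular_values_2x2(T):
    """Singular values of a real 2x2 matrix via the eigenvalues of T^T T."""
    (a, b), (c, d) = T
    # Entries of the symmetric matrix T^T T = [[p, q], [q, r]].
    p = a * a + c * c
    q = a * b + c * d
    r = b * b + d * d
    # Its eigenvalues are the roots of x^2 - (p + r) x + (p r - q^2).
    mean = (p + r) / 2
    radius = math.sqrt(((p - r) / 2) ** 2 + q * q)
    eigenvalues = [mean + radius, max(mean - radius, 0.0)]  # clamp rounding noise
    return [math.sqrt(x) for x in eigenvalues]

s1, s2 = singular_values_2x2([[1.0, 2.0], [0.0, 1.0]])
print(s1, s2)
```

For this example matrix the only eigenvalue is 1, while the singular values are $\sqrt{2} + 1$ and $\sqrt{2} - 1$; their product equals the absolute value of the determinant.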


EXERCISES 7.D 


1 Fix $u, x \in V$ with $u \ne 0$. Define $T \in \mathcal{L}(V)$ by
\[ Tv = \langle v, u \rangle x \]
for every $v \in V$. Prove that

\[ \sqrt{T^*T}v = \frac{\|x\|}{\|u\|} \langle v, u \rangle u \]

for every $v \in V$.


2 Give an example of $T \in \mathcal{L}(\mathbf{C}^2)$ such that 0 is the only eigenvalue of $T$
and the singular values of $T$ are 5, 0.
3 Suppose $T \in \mathcal{L}(V)$. Prove that there exists an isometry $S \in \mathcal{L}(V)$ such
that
\[ T = \sqrt{TT^*}\, S. \]


4 Suppose $T \in \mathcal{L}(V)$ and $s$ is a singular value of $T$. Prove that there exists
a vector $v \in V$ such that $\|v\| = 1$ and $\|Tv\| = s$.


5 Suppose $T \in \mathcal{L}(\mathbf{C}^2)$ is defined by $T(x, y) = (-4y, x)$. Find the
singular values of $T$.


6 Find the singular values of the differentiation operator $D \in \mathcal{L}(\mathcal{P}_2(\mathbf{R}))$
defined by $Dp = p'$, where the inner product on $\mathcal{P}_2(\mathbf{R})$ is as in Example
6.33.


7 Define $T \in \mathcal{L}(\mathbf{F}^3)$ by
\[ T(z_1, z_2, z_3) = (z_3, 2z_1, 3z_2). \]
Find (explicitly) an isometry $S \in \mathcal{L}(\mathbf{F}^3)$ such that $T = S\sqrt{T^*T}$.


8 Suppose $T \in \mathcal{L}(V)$, $S \in \mathcal{L}(V)$ is an isometry, and $R \in \mathcal{L}(V)$ is a
positive operator such that $T = SR$. Prove that $R = \sqrt{T^*T}$.

[The exercise above shows that if we write $T$ as the product of an isometry
and a positive operator (as in the Polar Decomposition 7.45), then the
positive operator equals $\sqrt{T^*T}$.]


9 Suppose $T \in \mathcal{L}(V)$. Prove that $T$ is invertible if and only if there exists
a unique isometry $S \in \mathcal{L}(V)$ such that $T = S\sqrt{T^*T}$.


10 Suppose $T \in \mathcal{L}(V)$ is self-adjoint. Prove that the singular values of $T$
equal the absolute values of the eigenvalues of $T$, repeated appropriately.


11 Suppose $T \in \mathcal{L}(V)$. Prove that $T$ and $T^*$ have the same singular values.


12 Prove or give a counterexample: if $T \in \mathcal{L}(V)$, then the singular values
of $T^2$ equal the squares of the singular values of $T$.


13 Suppose $T \in \mathcal{L}(V)$. Prove that $T$ is invertible if and only if 0 is not a
singular value of $T$.


14 Suppose $T \in \mathcal{L}(V)$. Prove that $\dim \operatorname{range} T$ equals the number of
nonzero singular values of $T$.


15 Suppose $S \in \mathcal{L}(V)$. Prove that $S$ is an isometry if and only if all the
singular values of $S$ equal 1.
16 Suppose $T_1, T_2 \in \mathcal{L}(V)$. Prove that $T_1$ and $T_2$ have the same singular
values if and only if there exist isometries $S_1, S_2 \in \mathcal{L}(V)$ such that
$T_1 = S_1 T_2 S_2$.


17 Suppose $T \in \mathcal{L}(V)$ has singular value decomposition given by
\[ Tv = s_1 \langle v, e_1 \rangle f_1 + \cdots + s_n \langle v, e_n \rangle f_n \]
for every $v \in V$, where $s_1, \dots, s_n$ are the singular values of $T$ and
$e_1, \dots, e_n$ and $f_1, \dots, f_n$ are orthonormal bases of $V$.

(a) Prove that if $v \in V$, then
\[ T^*v = s_1 \langle v, f_1 \rangle e_1 + \cdots + s_n \langle v, f_n \rangle e_n. \]
(b) Prove that if $v \in V$, then
\[ T^*Tv = s_1^2 \langle v, e_1 \rangle e_1 + \cdots + s_n^2 \langle v, e_n \rangle e_n. \]
(c) Prove that if $v \in V$, then
\[ \sqrt{T^*T}v = s_1 \langle v, e_1 \rangle e_1 + \cdots + s_n \langle v, e_n \rangle e_n. \]
(d) Suppose $T$ is invertible. Prove that if $v \in V$, then
\[ T^{-1}v = \frac{\langle v, f_1 \rangle e_1}{s_1} + \cdots + \frac{\langle v, f_n \rangle e_n}{s_n} \]
for every $v \in V$.


18 Suppose $T \in \mathcal{L}(V)$. Let $\hat{s}$ denote the smallest singular value of $T$, and
let $s$ denote the largest singular value of $T$.

(a) Prove that $\hat{s}\|v\| \le \|Tv\| \le s\|v\|$ for every $v \in V$.

(b) Suppose $\lambda$ is an eigenvalue of $T$. Prove that $\hat{s} \le |\lambda| \le s$.


19 Suppose $T \in \mathcal{L}(V)$. Show that $T$ is uniformly continuous with respect
to the metric $d$ on $V$ defined by $d(u, v) = \|u - v\|$.


20 Suppose $S, T \in \mathcal{L}(V)$. Let $s$ denote the largest singular value of $S$,
let $t$ denote the largest singular value of $T$, and let $r$ denote the largest
singular value of $S + T$. Prove that $r \le s + t$.


CHAPTER 


Hypatia, the 5th century Egyptian
mathematician and philosopher, as 
envisioned around 1900 by Alfred 
Seifert. 


Operators on Complex 
Vector Spaces 


In this chapter we delve deeper into the structure of operators, with most of 
the attention on complex vector spaces. An inner product does not help with 
this material, so we return to the general setting of a finite-dimensional vector 
space. To avoid some trivialities, we will assume that $V \ne \{0\}$. Thus our
assumptions for this chapter are as follows: 


8.1 Notation F, V

• $\mathbf{F}$ denotes $\mathbf{R}$ or $\mathbf{C}$.

• $V$ denotes a finite-dimensional nonzero vector space over $\mathbf{F}$.


LEARNING OBJECTIVES FOR THIS CHAPTER

• generalized eigenvectors and generalized eigenspaces
• characteristic polynomial and the Cayley-Hamilton Theorem
• decomposition of an operator
• minimal polynomial
• Jordan Form
8.A Generalized Eigenvectors and Nilpotent
Operators


Null Spaces of Powers of an Operator 


We begin this chapter with a study of null spaces of powers of an operator. 


8.2 Sequence of increasing null spaces 
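Suppose $T \in \mathcal{L}(V)$. Then

\[ \{0\} = \operatorname{null} T^0 \subset \operatorname{null} T^1 \subset \cdots \subset \operatorname{null} T^k \subset \operatorname{null} T^{k+1} \subset \cdots. \]

Proof Suppose $k$ is a nonnegative integer and $v \in \operatorname{null} T^k$. Then $T^k v = 0$,
and hence $T^{k+1}v = T(T^k v) = T(0) = 0$. Thus $v \in \operatorname{null} T^{k+1}$. Hence
$\operatorname{null} T^k \subset \operatorname{null} T^{k+1}$, as desired. ∎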


The next result says that if two consecutive terms in this sequence of 
subspaces are equal, then all later terms in the sequence are equal. 


8.3 Equality in the sequence of null spaces 


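Suppose $T \in \mathcal{L}(V)$. Suppose $m$ is a nonnegative integer such that
$\operatorname{null} T^m = \operatorname{null} T^{m+1}$. Then

\[ \operatorname{null} T^m = \operatorname{null} T^{m+1} = \operatorname{null} T^{m+2} = \operatorname{null} T^{m+3} = \cdots. \]

Proof Let $k$ be a positive integer. We want to prove that

\[ \operatorname{null} T^{m+k} = \operatorname{null} T^{m+k+1}. \]

We already know from 8.2 that $\operatorname{null} T^{m+k} \subset \operatorname{null} T^{m+k+1}$.
To prove the inclusion in the other direction, suppose $v \in \operatorname{null} T^{m+k+1}$.
Then

\[ T^{m+1}(T^k v) = T^{m+k+1}v = 0. \]

Hence

\[ T^k v \in \operatorname{null} T^{m+1} = \operatorname{null} T^m. \]

Thus $T^{m+k}v = T^m(T^k v) = 0$, which means that $v \in \operatorname{null} T^{m+k}$. This
implies that $\operatorname{null} T^{m+k+1} \subset \operatorname{null} T^{m+k}$, completing the proof. ∎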
The proposition above raises the question of whether there exists a nonnegative
integer $m$ such that $\operatorname{null} T^m = \operatorname{null} T^{m+1}$. The proposition below
shows that this equality holds at least when $m$ equals the dimension of the
vector space on which $T$ operates.


8.4 Null spaces stop growing

Suppose $T \in \mathcal{L}(V)$. Let $n = \dim V$. Then

\[ \operatorname{null} T^n = \operatorname{null} T^{n+1} = \operatorname{null} T^{n+2} = \cdots. \]

Proof We need only prove that $\operatorname{null} T^n = \operatorname{null} T^{n+1}$ (by 8.3). Suppose this
is not true. Then, by 8.2 and 8.3, we have

\[ \{0\} = \operatorname{null} T^0 \subsetneq \operatorname{null} T^1 \subsetneq \cdots \subsetneq \operatorname{null} T^n \subsetneq \operatorname{null} T^{n+1}, \]

where the symbol $\subsetneq$ means “contained in but not equal to”. At each of the
strict inclusions in the chain above, the dimension increases by at least 1.
Thus $\dim \operatorname{null} T^{n+1} \ge n + 1$, a contradiction because a subspace of $V$ cannot
have a larger dimension than $n$. ∎


Unfortunately, it is not true that $V = \operatorname{null} T \oplus \operatorname{range} T$ for each $T \in \mathcal{L}(V)$.
However, the following result is a useful substitute.


8.5 $V$ is the direct sum of $\operatorname{null} T^{\dim V}$ and $\operatorname{range} T^{\dim V}$

Suppose $T \in \mathcal{L}(V)$. Let $n = \dim V$. Then

\[ V = \operatorname{null} T^n \oplus \operatorname{range} T^n. \]

Proof First we show that

8.6 \[ (\operatorname{null} T^n) \cap (\operatorname{range} T^n) = \{0\}. \]

Suppose $v \in (\operatorname{null} T^n) \cap (\operatorname{range} T^n)$. Then $T^n v = 0$, and there exists $u \in V$
such that $v = T^n u$. Applying $T^n$ to both sides of the last equation shows that
$T^n v = T^{2n}u$. Hence $T^{2n}u = 0$, which implies that $T^n u = 0$ (by 8.4). Thus
$v = T^n u = 0$, completing the proof of 8.6.

Now 8.6 implies that $\operatorname{null} T^n + \operatorname{range} T^n$ is a direct sum (by 1.45). Also,

\[ \dim(\operatorname{null} T^n \oplus \operatorname{range} T^n) = \dim \operatorname{null} T^n + \dim \operatorname{range} T^n = \dim V, \]

where the first equality above comes from 3.78 and the second equality comes
from the Fundamental Theorem of Linear Maps (3.22). The equation above
implies that $\operatorname{null} T^n \oplus \operatorname{range} T^n = V$, as desired. ∎
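8.7 Example Suppose $T \in \mathcal{L}(\mathbf{F}^3)$ is defined by
\[ T(z_1, z_2, z_3) = (4z_2, 0, 5z_3). \]

For this operator, $\operatorname{null} T + \operatorname{range} T$ is not a direct sum of subspaces, because
$\operatorname{null} T = \{(z_1, 0, 0) : z_1 \in \mathbf{F}\}$ and $\operatorname{range} T = \{(z_1, 0, z_3) : z_1, z_3 \in \mathbf{F}\}$.
Thus $\operatorname{null} T \cap \operatorname{range} T \ne \{0\}$ and hence $\operatorname{null} T + \operatorname{range} T$ is not a direct sum.
Also note that $\operatorname{null} T + \operatorname{range} T \ne \mathbf{F}^3$.

However, we have $T^3(z_1, z_2, z_3) = (0, 0, 125z_3)$. Thus we see that
$\operatorname{null} T^3 = \{(z_1, z_2, 0) : z_1, z_2 \in \mathbf{F}\}$ and $\operatorname{range} T^3 = \{(0, 0, z_3) : z_3 \in \mathbf{F}\}$.
Hence $\mathbf{F}^3 = \operatorname{null} T^3 \oplus \operatorname{range} T^3$.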
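
The behavior described in 8.2 through 8.5 can be observed on the operator of Example 8.7 by computing $\dim \operatorname{null} T^k$ for successive $k$. The following is a minimal sketch, assuming the matrix of $T$ with respect to the standard basis and using $\dim \operatorname{null} T^k = n - \dim \operatorname{range} T^k$ (Fundamental Theorem of Linear Maps); the `rank` helper is an ordinary Gaussian elimination over the rationals written for this illustration.

```python
from fractions import Fraction

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def rank(A):
    """Rank of a matrix via Gaussian elimination with exact arithmetic."""
    M = [[Fraction(x) for x in row] for row in A]
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, rows):
            factor = M[i][c] / M[r][c]
            M[i] = [x - factor * y for x, y in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return r

# Matrix of T(z1, z2, z3) = (4 z2, 0, 5 z3) from Example 8.7.
T = [[0, 4, 0],
     [0, 0, 0],
     [0, 0, 5]]
n = 3

power = [[1 if i == j else 0 for j in range(n)] for i in range(n)]  # T^0 = I
null_dims = []
for k in range(n + 2):
    null_dims.append(n - rank(power))  # dim null T^k
    power = matmul(power, T)
print(null_dims)
```

The dimensions increase strictly until two consecutive terms agree and then stay constant, as 8.3 predicts; for this operator the sequence stabilizes at $k = 2$, before the guaranteed bound $k = \dim V = 3$ from 8.4.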


Generalized Eigenvectors 


Unfortunately, some operators do not have enough eigenvectors to lead to 
a good description. Thus in this subsection we introduce the concept of 
generalized eigenvectors, which will play a major role in our description of 
the structure of an operator. 

To understand why we need more than eigenvectors, let’s examine the 
question of describing an operator by decomposing its domain into invariant 
subspaces. Fix T € L(V). We seek to describe T by finding a “nice” direct 
sum decomposition 

\[ V = U_1 \oplus \cdots \oplus U_n, \]

where each $U_j$ is a subspace of $V$ invariant under $T$. The simplest possible
nonzero invariant subspaces are 1-dimensional. A decomposition as above
where each $U_j$ is a 1-dimensional subspace of $V$ invariant under $T$ is possible
if and only if $V$ has a basis consisting of eigenvectors of $T$ (see 5.41). This
happens if and only if $V$ has an eigenspace decomposition

8.8 \[ V = E(\lambda_1, T) \oplus \cdots \oplus E(\lambda_m, T), \]

where $\lambda_1, \dots, \lambda_m$ are the distinct eigenvalues of $T$ (see 5.41).

The Spectral Theorem in the previous chapter shows that if V is an inner 
product space, then a decomposition of the form 8.8 holds for every normal 
operator if F = C and for every self-adjoint operator if F = R because 
operators of those types have enough eigenvectors to form a basis of V (see 
7.24 and 7.29). 
Sadly, a decomposition of the form 8.8 may not hold for more general
operators, even on a complex vector space. An example was given by the operator
in 5.43, which does not have enough eigenvectors for 8.8 to hold. General- 
ized eigenvectors and generalized eigenspaces, which we now introduce, will 
remedy this situation. 


8.9 Definition generalized eigenvector 


Suppose $T \in \mathcal{L}(V)$ and $\lambda$ is an eigenvalue of $T$. A vector $v \in V$ is called
a generalized eigenvector of $T$ corresponding to $\lambda$ if $v \ne 0$ and

\[ (T - \lambda I)^j v = 0 \]

for some positive integer $j$.


Although $j$ is allowed to be an arbitrary integer in the equation

\[ (T - \lambda I)^j v = 0 \]

in the definition of a generalized eigenvector, we will soon prove that every
generalized eigenvector satisfies this equation with $j = \dim V$.

[Note that we do not define the concept of a generalized eigenvalue,
because this would not lead to anything new. Reason: if $(T - \lambda I)^j$ is
not injective for some positive integer $j$, then $T - \lambda I$ is not injective,
and hence $\lambda$ is an eigenvalue of $T$.]


8.10 Definition generalized eigenspace, $G(\lambda, T)$


Suppose $T \in \mathcal{L}(V)$ and $\lambda \in \mathbf{F}$. The generalized eigenspace of $T$
corresponding to $\lambda$, denoted $G(\lambda, T)$, is defined to be the set of all generalized
eigenvectors of $T$ corresponding to $\lambda$, along with the 0 vector.


Because every eigenvector of $T$ is a generalized eigenvector of $T$ (take
$j = 1$ in the definition of generalized eigenvector), each eigenspace is
contained in the corresponding generalized eigenspace. In other words, if
$T \in \mathcal{L}(V)$ and $\lambda \in \mathbf{F}$, then

\[ E(\lambda, T) \subset G(\lambda, T). \]

The next result implies that if $T \in \mathcal{L}(V)$ and $\lambda \in \mathbf{F}$, then $G(\lambda, T)$ is a
subspace of $V$ (because the null space of each linear map on $V$ is a subspace
of $V$).
8.11 Description of generalized eigenspaces

Suppose $T \in \mathcal{L}(V)$ and $\lambda \in \mathbf{F}$. Then $G(\lambda, T) = \operatorname{null}(T - \lambda I)^{\dim V}$.


Proof Suppose $v \in \operatorname{null}(T - \lambda I)^{\dim V}$. The definitions imply $v \in G(\lambda, T)$.
Thus $G(\lambda, T) \supset \operatorname{null}(T - \lambda I)^{\dim V}$.

Conversely, suppose $v \in G(\lambda, T)$. Thus there is a positive integer $j$ such
that

\[ v \in \operatorname{null}(T - \lambda I)^j. \]

From 8.2 and 8.4 (with $T - \lambda I$ replacing $T$), we get $v \in \operatorname{null}(T - \lambda I)^{\dim V}$.
Thus $G(\lambda, T) \subset \operatorname{null}(T - \lambda I)^{\dim V}$, completing the proof. ∎


8.12 Example Define $T \in \mathcal{L}(\mathbf{C}^3)$ by

$$T(z_1, z_2, z_3) = (4z_2, 0, 5z_3).$$


(a) Find all eigenvalues of T, the corresponding eigenspaces, and the
corresponding generalized eigenspaces.


(b) Show that $\mathbf{C}^3$ is the direct sum of generalized eigenspaces
corresponding to the distinct eigenvalues of T.


Solution


(a) A routine use of the definition of eigenvalue shows that the eigenvalues
of T are 0 and 5. The corresponding eigenspaces are easily seen to be
$E(0, T) = \{(z_1, 0, 0) : z_1 \in \mathbf{C}\}$ and $E(5, T) = \{(0, 0, z_3) : z_3 \in \mathbf{C}\}$.


Note that this operator T does not have enough eigenvectors to span its
domain $\mathbf{C}^3$.


We have $T^3(z_1, z_2, z_3) = (0, 0, 125z_3)$ for all $z_1, z_2, z_3 \in \mathbf{C}$. Thus
8.11 implies that $G(0, T) = \{(z_1, z_2, 0) : z_1, z_2 \in \mathbf{C}\}$.


We have $(T - 5I)^3(z_1, z_2, z_3) = (-125z_1 + 300z_2, -125z_2, 0)$. Thus
8.11 implies that $G(5, T) = \{(0, 0, z_3) : z_3 \in \mathbf{C}\}$.


(b) The results in part (a) show that $\mathbf{C}^3 = G(0, T) \oplus G(5, T)$.
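
The two cube computations in part (a) can be checked mechanically. The following is a minimal sketch in plain Python, assuming the matrix of T with respect to the standard basis; the helper names `matmul` and `matpow` are illustrative and not from the text.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matpow(a, p):
    """Raise a square matrix to the positive integer power p."""
    result = a
    for _ in range(p - 1):
        result = matmul(result, a)
    return result


# Matrix of T(z1, z2, z3) = (4*z2, 0, 5*z3) with respect to the standard basis.
T = [[0, 4, 0],
     [0, 0, 0],
     [0, 0, 5]]

# Matrix of T - 5I.
T_minus_5I = [[T[i][j] - (5 if i == j else 0) for j in range(3)]
              for i in range(3)]

T_cubed = matpow(T, 3)                 # rows encode (0, 0, 125*z3)
shifted_cubed = matpow(T_minus_5I, 3)  # rows encode (-125*z1 + 300*z2, -125*z2, 0)

print(T_cubed)
print(shifted_cubed)
```

Reading off the null spaces of the two printed matrices gives G(0, T) and G(5, T) as stated in the example.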
One of our major goals in this chapter is to show that the result in part (b)
of the example above holds in general for operators on finite-dimensional 
complex vector spaces; we will do this in 8.21. 

We saw earlier (5.10) that eigenvectors corresponding to distinct eigenval- 
ues are linearly independent. Now we prove a similar result for generalized 
eigenvectors. 


8.13 Linearly independent generalized eigenvectors 


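Let $T \in \mathcal{L}(V)$. Suppose $\lambda_1, \dots, \lambda_m$ are distinct eigenvalues of T and
$v_1, \dots, v_m$ are corresponding generalized eigenvectors. Then $v_1, \dots, v_m$
is linearly independent.


Proof Suppose $a_1, \dots, a_m$ are complex numbers such that

8.14   $0 = a_1 v_1 + \cdots + a_m v_m.$

Let k be the largest nonnegative integer such that $(T - \lambda_1 I)^k v_1 \neq 0$. Let

$$w = (T - \lambda_1 I)^k v_1.$$

Thus

$$(T - \lambda_1 I)w = (T - \lambda_1 I)^{k+1} v_1 = 0,$$

and hence $Tw = \lambda_1 w$. Thus $(T - \lambda I)w = (\lambda_1 - \lambda)w$ for every $\lambda \in \mathbf{F}$ and
hence

8.15   $(T - \lambda I)^n w = (\lambda_1 - \lambda)^n w$

for every $\lambda \in \mathbf{F}$, where $n = \dim V$.
Apply the operator

$$(T - \lambda_1 I)^k (T - \lambda_2 I)^n \cdots (T - \lambda_m I)^n$$

to both sides of 8.14, getting

$$\begin{aligned}
0 &= a_1 (T - \lambda_1 I)^k (T - \lambda_2 I)^n \cdots (T - \lambda_m I)^n v_1 \\
  &= a_1 (T - \lambda_2 I)^n \cdots (T - \lambda_m I)^n w \\
  &= a_1 (\lambda_1 - \lambda_2)^n \cdots (\lambda_1 - \lambda_m)^n w,
\end{aligned}$$

where we have used 8.11 to get the first equation above and 8.15 to get the
last equation above.


Because $w \neq 0$ and the $\lambda_j$ are distinct, the equation above implies that
$a_1 = 0$. In a similar fashion, $a_j = 0$ for each j, which implies that
$v_1, \dots, v_m$ is linearly independent. ∎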
Nilpotent Operators


8.16 Definition nilpotent 


An operator is called nilpotent if some power of it equals 0. 


8.17 Example nilpotent operators 


(a) The operator $N \in \mathcal{L}(\mathbf{F}^4)$ defined by

$$N(z_1, z_2, z_3, z_4) = (z_3, z_4, 0, 0)$$

is nilpotent because $N^2 = 0$.


(b) The operator of differentiation on $\mathcal{P}_m(\mathbf{R})$ is nilpotent because the
$(m + 1)$st derivative of every polynomial of degree at most m equals 0.
Note that on this space of dimension m + 1, we need to raise the
nilpotent operator to the power m + 1 to get the 0 operator.


The next result shows that we never 
need to use a power higher than the di- 
mension of the space. 


The Latin word nil means noth- 
ing or zero; the Latin word potent 
means power. Thus nilpotent liter- 
ally means zero power. 


8.18 Nilpotent operator raised to dimension of domain is 0 
Suppose $N \in \mathcal{L}(V)$ is nilpotent. Then $N^{\dim V} = 0$.


Proof Because N is nilpotent, $G(0, N) = V$. Thus 8.11 implies that
$\operatorname{null} N^{\dim V} = V$, as desired. ∎
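
Example 8.17(b) and result 8.18 can be illustrated numerically. The sketch below assumes m = 3 and the basis 1, x, x^2, x^3 of the space of polynomials of degree at most 3 (both choices are assumptions made for illustration). It builds the matrix of differentiation and finds the first power that equals 0.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


m = 3            # polynomials of degree at most m (illustrative choice)
size = m + 1     # dimension of the space

# Differentiation sends x^j to j * x^(j-1), so column j has the entry j in row j-1.
D = [[(j if i == j - 1 else 0) for j in range(size)] for i in range(size)]

# powers[k-1] holds D^k for k = 1, ..., size.
powers = [D]
for _ in range(size - 1):
    powers.append(matmul(powers[-1], D))


def is_zero(matrix):
    return all(entry == 0 for row in matrix for entry in row)


# Smallest k with D^k = 0; by 8.18 it can never exceed the dimension.
first_zero_power = next(k for k, p in enumerate(powers, start=1) if is_zero(p))
print(first_zero_power)
```

Here the third power is still nonzero (it sends x^3 to the constant 6), so the full power m + 1 is needed, as the example states.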


Given an operator T on V, we want to find a basis of V such that the
matrix of T with respect to this basis is as simple as possible, meaning that
the matrix contains many 0's.

The next result shows that if N is nilpotent, then we can choose a basis
of V such that the matrix of N with respect to this basis has more than half
of its entries equal to 0. Later in this chapter we will do even better.

[If V is a complex vector space, a proof of the next result follows easily
from Exercise 7, 5.27, and 5.32. But the proof given here uses simpler ideas
than needed to prove 5.27, and it works for both real and complex vector
spaces.]
8.19 Matrix of a nilpotent operator


Suppose N is a nilpotent operator on V. Then there is a basis of V with
respect to which the matrix of N has the form

$$\begin{pmatrix} 0 & & * \\ & \ddots & \\ 0 & & 0 \end{pmatrix};$$

here all entries on and below the diagonal are 0's.


Proof First choose a basis of $\operatorname{null} N$. Then extend this to a basis of $\operatorname{null} N^2$.
Then extend to a basis of $\operatorname{null} N^3$. Continue in this fashion, eventually getting
a basis of V (because 8.18 states that $\operatorname{null} N^{\dim V} = V$).

Now let's think about the matrix of N with respect to this basis. The
first column, and perhaps additional columns at the beginning, consists of
all 0's, because the corresponding basis vectors are in $\operatorname{null} N$. The next set
of columns comes from basis vectors in $\operatorname{null} N^2$. Applying N to any such
vector, we get a vector in $\operatorname{null} N$; in other words, we get a vector that is a
linear combination of the previous basis vectors. Thus all nonzero entries in
these columns lie above the diagonal. The next set of columns comes from
basis vectors in $\operatorname{null} N^3$. Applying N to any such vector, we get a vector in
$\operatorname{null} N^2$; in other words, we get a vector that is a linear combination of the
previous basis vectors. Thus once again, all nonzero entries in these columns
lie above the diagonal. Continue in this fashion to complete the proof. ∎
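
As a small worked illustration of this construction (an added example, not taken from the original text), consider the operator N of Example 8.17(a). The symbols $e_1, \dots, e_4$ for the standard basis of $\mathbf{F}^4$ are notation assumed here.

```latex
\[
\operatorname{null} N = \operatorname{span}(e_1, e_2), \qquad
\operatorname{null} N^2 = \mathbf{F}^4 = \operatorname{span}(e_1, e_2, e_3, e_4),
\]
\[
N e_1 = 0, \quad N e_2 = 0, \quad N e_3 = e_1, \quad N e_4 = e_2,
\qquad
\mathcal{M}(N) =
\begin{pmatrix}
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}.
\]
```

The first two columns are 0 because $e_1, e_2$ lie in null N, and the last two columns have nonzero entries only above the diagonal, as the proof predicts.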


EXERCISES 8.A 


1 Define $T \in \mathcal{L}(\mathbf{C}^2)$ by

$$T(w, z) = (z, 0).$$

Find all generalized eigenvectors of T.


2 Define $T \in \mathcal{L}(\mathbf{C}^2)$ by

$$T(w, z) = (-z, w).$$

Find the generalized eigenspaces corresponding to the distinct eigenvalues
of T.
3 Suppose $T \in \mathcal{L}(V)$ is invertible. Prove that $G(\lambda, T) = G(\frac{1}{\lambda}, T^{-1})$ for
every $\lambda \in \mathbf{F}$ with $\lambda \neq 0$.


4 Suppose $T \in \mathcal{L}(V)$ and $\alpha, \beta \in \mathbf{F}$ with $\alpha \neq \beta$. Prove that

$$G(\alpha, T) \cap G(\beta, T) = \{0\}.$$


5 Suppose $T \in \mathcal{L}(V)$, m is a positive integer, and $v \in V$ is such that
$T^{m-1}v \neq 0$ but $T^m v = 0$. Prove that

$$v, Tv, T^2 v, \dots, T^{m-1}v$$

is linearly independent.


6 Suppose $T \in \mathcal{L}(\mathbf{C}^3)$ is defined by $T(z_1, z_2, z_3) = (z_2, z_3, 0)$. Prove
that T has no square root. More precisely, prove that there does not exist
$S \in \mathcal{L}(\mathbf{C}^3)$ such that $S^2 = T$.


7 Suppose $N \in \mathcal{L}(V)$ is nilpotent. Prove that 0 is the only eigenvalue
of N.


8 Prove or give a counterexample: The set of nilpotent operators on V is a
subspace of $\mathcal{L}(V)$.


9 Suppose $S, T \in \mathcal{L}(V)$ and ST is nilpotent. Prove that TS is nilpotent.


10 Suppose that $T \in \mathcal{L}(V)$ is not nilpotent. Let $n = \dim V$. Show that
$V = \operatorname{null} T^{n-1} \oplus \operatorname{range} T^{n-1}$.


11 Prove or give a counterexample: If V is a complex vector space and
$\dim V = n$ and $T \in \mathcal{L}(V)$, then $T^n$ is diagonalizable.


12 Suppose $N \in \mathcal{L}(V)$ and there exists a basis of V with respect to which
N has an upper-triangular matrix with only 0's on the diagonal. Prove
that N is nilpotent.


13 Suppose V is an inner product space and $N \in \mathcal{L}(V)$ is normal and
nilpotent. Prove that N = 0.


14 Suppose V is an inner product space and $N \in \mathcal{L}(V)$ is nilpotent. Prove
that there exists an orthonormal basis of V with respect to which N has
an upper-triangular matrix.

[If $\mathbf{F} = \mathbf{C}$, then the result above follows from Schur's Theorem (6.38)
without the hypothesis that N is nilpotent. Thus the exercise above needs
to be proved only when $\mathbf{F} = \mathbf{R}$.]
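15 Suppose $N \in \mathcal{L}(V)$ is such that $\operatorname{null} N^{\dim V - 1} \neq \operatorname{null} N^{\dim V}$. Prove
that N is nilpotent and that

$$\dim \operatorname{null} N^j = j$$

for every integer j with $0 \le j \le \dim V$.


16 Suppose $T \in \mathcal{L}(V)$. Show that

$$V = \operatorname{range} T^0 \supset \operatorname{range} T^1 \supset \cdots \supset \operatorname{range} T^k \supset \operatorname{range} T^{k+1} \supset \cdots.$$


17 Suppose $T \in \mathcal{L}(V)$ and m is a nonnegative integer such that

$$\operatorname{range} T^m = \operatorname{range} T^{m+1}.$$

Prove that $\operatorname{range} T^k = \operatorname{range} T^m$ for all k > m.


18 Suppose $T \in \mathcal{L}(V)$. Let $n = \dim V$. Prove that

$$\operatorname{range} T^n = \operatorname{range} T^{n+1} = \operatorname{range} T^{n+2} = \cdots.$$


19 Suppose $T \in \mathcal{L}(V)$ and m is a nonnegative integer. Prove that

$$\operatorname{null} T^m = \operatorname{null} T^{m+1} \text{ if and only if } \operatorname{range} T^m = \operatorname{range} T^{m+1}.$$


20 Suppose $T \in \mathcal{L}(\mathbf{C}^5)$ is such that $\operatorname{range} T^4 \neq \operatorname{range} T^5$. Prove that T is
nilpotent.


21 Find a vector space W and $T \in \mathcal{L}(W)$ such that $\operatorname{null} T^k \subsetneq \operatorname{null} T^{k+1}$
and $\operatorname{range} T^k \supsetneq \operatorname{range} T^{k+1}$ for every positive integer k.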
8.B Decomposition of an Operator


Description of Operators on Complex Vector Spaces 


We saw earlier that the domain of an operator might not decompose into 
eigenspaces, even on a finite-dimensional complex vector space. In this 
section we will see that every operator on a finite-dimensional complex vector 
space has enough generalized eigenvectors to provide a decomposition. 

We observed earlier that if $T \in \mathcal{L}(V)$, then null T and range T are invariant
under T [see 5.3, parts (c) and (d)]. Now we show that the null space and
the range of each polynomial of T are also invariant under T.


8.20 The null space and range of p(T) are invariant under T
Suppose $T \in \mathcal{L}(V)$ and $p \in \mathcal{P}(\mathbf{F})$. Then null p(T) and range p(T) are
invariant under T.


Proof Suppose $v \in \operatorname{null} p(T)$. Then p(T)v = 0. Thus

$$\bigl(p(T)\bigr)(Tv) = T\bigl(p(T)v\bigr) = T(0) = 0.$$

Hence $Tv \in \operatorname{null} p(T)$. Thus null p(T) is invariant under T, as desired.
Suppose $v \in \operatorname{range} p(T)$. Then there exists $u \in V$ such that v = p(T)u.
Thus

$$Tv = T\bigl(p(T)u\bigr) = p(T)(Tu).$$

Hence $Tv \in \operatorname{range} p(T)$. Thus range p(T) is invariant under T, as desired. ∎


The following major result shows that every operator on a complex vector 
space can be thought of as composed of pieces, each of which is a nilpotent 
operator plus a scalar multiple of the identity. Actually we have already done 
the hard work in our discussion of the generalized eigenspaces $G(\lambda, T)$, so at
this point the proof is easy. 

8.21 Description of operators on complex vector spaces 

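Suppose V is a complex vector space and $T \in \mathcal{L}(V)$. Let $\lambda_1, \dots, \lambda_m$ be
the distinct eigenvalues of T. Then

(a) $V = G(\lambda_1, T) \oplus \cdots \oplus G(\lambda_m, T)$;


(b) each $G(\lambda_j, T)$ is invariant under T;


(c) each $(T - \lambda_j I)|_{G(\lambda_j, T)}$ is nilpotent.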
Proof Let $n = \dim V$. Recall that $G(\lambda_j, T) = \operatorname{null}(T - \lambda_j I)^n$ for each j
(by 8.11). From 8.20 [with $p(z) = (z - \lambda_j)^n$], we get (b). Obviously (c)
follows from the definitions.

We will prove (a) by induction on n. To get started, note that the desired
result holds if n = 1. Thus we can assume that n > 1 and that the desired
result holds on all vector spaces of smaller dimension.

Because V is a complex vector space, T has an eigenvalue (see 5.21); thus
$m \ge 1$. Applying 8.5 to $T - \lambda_1 I$ shows that

8.22   $V = G(\lambda_1, T) \oplus U,$

where $U = \operatorname{range}(T - \lambda_1 I)^n$. Using 8.20 [with $p(z) = (z - \lambda_1)^n$], we see
that U is invariant under T. Because $G(\lambda_1, T) \neq \{0\}$, we have $\dim U < n$.
Thus we can apply our induction hypothesis to $T|_U$.

None of the generalized eigenvectors of $T|_U$ correspond to the eigenvalue
$\lambda_1$, because all generalized eigenvectors of T corresponding to $\lambda_1$ are in
$G(\lambda_1, T)$. Thus each eigenvalue of $T|_U$ is in $\{\lambda_2, \dots, \lambda_m\}$.

By our induction hypothesis, $U = G(\lambda_2, T|_U) \oplus \cdots \oplus G(\lambda_m, T|_U)$.
Combining this information with 8.22 will complete the proof if we can show
that $G(\lambda_k, T|_U) = G(\lambda_k, T)$ for $k = 2, \dots, m$.

Thus fix $k \in \{2, \dots, m\}$. The inclusion $G(\lambda_k, T|_U) \subset G(\lambda_k, T)$ is clear.

To prove the inclusion in the other direction, suppose $v \in G(\lambda_k, T)$. By
8.22, we can write $v = v_1 + u$, where $v_1 \in G(\lambda_1, T)$ and $u \in U$. Our
induction hypothesis implies that

$$u = v_2 + \cdots + v_m,$$

where each $v_j$ is in $G(\lambda_j, T|_U)$, which is a subset of $G(\lambda_j, T)$. Thus

$$v = v_1 + v_2 + \cdots + v_m.$$

Because generalized eigenvectors corresponding to distinct eigenvalues are
linearly independent (see 8.13), the equation above implies that each $v_j$ equals
0 except possibly when j = k. In particular, $v_1 = 0$ and thus $v = u \in U$.
Because $v \in U$, we can conclude that $v \in G(\lambda_k, T|_U)$, completing the
proof. ∎


As we know, an operator on a complex vector space may not have enough 
eigenvectors to form a basis of the domain. The next result shows that on a 
complex vector space there are enough generalized eigenvectors to do this. 
8.23 A basis of generalized eigenvectors


Suppose V is a complex vector space and $T \in \mathcal{L}(V)$. Then there is a basis
of V consisting of generalized eigenvectors of T.


Proof Choose a basis of each $G(\lambda_j, T)$ in 8.21. Put all these bases together
to form a basis of V consisting of generalized eigenvectors of T. ∎


Multiplicity of an Eigenvalue 


If V is a complex vector space and $T \in \mathcal{L}(V)$, then the decomposition of V
provided by 8.21 can be a powerful tool. The dimensions of the subspaces 
involved in this decomposition are sufficiently important to get a name. 


8.24 Definition multiplicity 


• Suppose $T \in \mathcal{L}(V)$. The multiplicity of an eigenvalue $\lambda$ of T
is defined to be the dimension of the corresponding generalized
eigenspace $G(\lambda, T)$.


• In other words, the multiplicity of an eigenvalue $\lambda$ of T equals
$\dim \operatorname{null}(T - \lambda I)^{\dim V}$.


The second bullet point above is justified by 8.11.


8.25 Example Suppose $T \in \mathcal{L}(\mathbf{C}^3)$ is defined by

$$T(z_1, z_2, z_3) = (6z_1 + 3z_2 + 4z_3, 6z_2 + 2z_3, 7z_3).$$

The matrix of T (with respect to the standard basis) is

$$\begin{pmatrix} 6 & 3 & 4 \\ 0 & 6 & 2 \\ 0 & 0 & 7 \end{pmatrix}.$$

The eigenvalues of T are 6 and 7, as follows from 5.32. You can verify that
the generalized eigenspaces of T are as follows:

$$G(6, T) = \operatorname{span}\bigl((1, 0, 0), (0, 1, 0)\bigr) \quad\text{and}\quad G(7, T) = \operatorname{span}\bigl((10, 2, 1)\bigr).$$

Thus the eigenvalue 6 has multiplicity 2 and the eigenvalue 7 has multiplicity 1.

The direct sum $\mathbf{C}^3 = G(6, T) \oplus G(7, T)$ is the decomposition promised by
8.21. A basis of $\mathbf{C}^3$ consisting of generalized eigenvectors of T, as promised
by 8.23, is

$$(1, 0, 0), (0, 1, 0), (10, 2, 1).$$
In Example 8.25, the sum of the multiplicities of the eigenvalues of T
equals 3, which is the dimension of the domain of T. The next result shows 
that this always happens on a complex vector space. 


8.26 Sum of the multiplicities equals dim V 


Suppose V is a complex vector space and $T \in \mathcal{L}(V)$. Then the sum of the
multiplicities of all the eigenvalues of T equals dim V.


Proof The desired result follows from 8.21 and the obvious formula for the
dimension of a direct sum (see 3.78 or Exercise 16 in Section 2.C). ∎


The terms algebraic multiplicity and geometric multiplicity are used in
some books. In case you encounter this terminology, be aware that the
algebraic multiplicity is the same as the multiplicity defined here and the
geometric multiplicity is the dimension of the corresponding eigenspace. In
other words, if $T \in \mathcal{L}(V)$ and $\lambda$ is an eigenvalue of T, then

$$\begin{aligned}
\text{algebraic multiplicity of } \lambda &= \dim \operatorname{null}(T - \lambda I)^{\dim V} = \dim G(\lambda, T), \\
\text{geometric multiplicity of } \lambda &= \dim \operatorname{null}(T - \lambda I) = \dim E(\lambda, T).
\end{aligned}$$

Note that as defined above, the algebraic multiplicity also has a geometric
meaning as the dimension of a certain null space. The definition of multiplicity
given here is cleaner than the traditional definition that involves determinants;
10.25 implies that these definitions are equivalent.
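
The two notions of multiplicity can be compared on the operator of Example 8.25. The following sketch computes both as dimensions of null spaces, using exact rational arithmetic; the helpers `rank`, `matmul`, and `nullity_of_shifted_power` are illustrative names introduced here, not part of the text.

```python
from fractions import Fraction


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def rank(matrix):
    """Rank of a matrix via Gaussian elimination over exact rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    found = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(found, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(found + 1, len(rows)):
            factor = rows[r][col] / rows[found][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[found])]
        found += 1
    return found


def nullity_of_shifted_power(T, lam, power):
    """dim null (T - lam*I)^power."""
    n = len(T)
    shifted = [[T[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]
    result = shifted
    for _ in range(power - 1):
        result = matmul(result, shifted)
    return n - rank(result)


T = [[6, 3, 4],
     [0, 6, 2],
     [0, 0, 7]]
n = len(T)

# Algebraic multiplicity uses the power dim V; geometric multiplicity uses power 1.
algebraic = {lam: nullity_of_shifted_power(T, lam, n) for lam in (6, 7)}
geometric = {lam: nullity_of_shifted_power(T, lam, 1) for lam in (6, 7)}
print(algebraic, geometric)
```

For the eigenvalue 6 the two numbers differ, which reflects the fact that T does not have enough eigenvectors to form a basis.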


Block Diagonal Matrices 


To interpret our results in matrix form, we make the following definition,
generalizing the notion of a diagonal matrix. If each matrix $A_j$ in the definition
below is a 1-by-1 matrix, then we actually have a diagonal matrix.

[Often we can understand a matrix better by thinking of it as composed
of smaller matrices.]


8.27 Definition block diagonal matrix


A block diagonal matrix is a square matrix of the form

$$\begin{pmatrix} A_1 & & 0 \\ & \ddots & \\ 0 & & A_m \end{pmatrix},$$

where $A_1, \dots, A_m$ are square matrices lying along the diagonal and all
the other entries of the matrix equal 0.
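8.28 Example The 5-by-5 matrix

$$A = \left(\begin{array}{c|cc|cc}
4 & 0 & 0 & 0 & 0 \\ \hline
0 & 2 & -3 & 0 & 0 \\
0 & 0 & 2 & 0 & 0 \\ \hline
0 & 0 & 0 & 1 & 7 \\
0 & 0 & 0 & 0 & 1
\end{array}\right)$$

is a block diagonal matrix with

$$A = \begin{pmatrix} A_1 & & 0 \\ & A_2 & \\ 0 & & A_3 \end{pmatrix},$$

where

$$A_1 = \begin{pmatrix} 4 \end{pmatrix}, \quad
A_2 = \begin{pmatrix} 2 & -3 \\ 0 & 2 \end{pmatrix}, \quad
A_3 = \begin{pmatrix} 1 & 7 \\ 0 & 1 \end{pmatrix}.$$

Here the inner matrices in the 5-by-5 matrix above are blocked off to show
how we can think of it as a block diagonal matrix.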


Note that in the next result we get many more zeros in the matrix of T 
than are needed to make it upper triangular. 


8.29 Block diagonal matrix with upper-triangular blocks 


Suppose V is a complex vector space and $T \in \mathcal{L}(V)$. Let $\lambda_1, \dots, \lambda_m$ be
the distinct eigenvalues of T, with multiplicities $d_1, \dots, d_m$. Then there is
a basis of V with respect to which T has a block diagonal matrix of the
form

$$\begin{pmatrix} A_1 & & 0 \\ & \ddots & \\ 0 & & A_m \end{pmatrix},$$

where each $A_j$ is a $d_j$-by-$d_j$ upper-triangular matrix of the form

$$A_j = \begin{pmatrix} \lambda_j & & * \\ & \ddots & \\ 0 & & \lambda_j \end{pmatrix}.$$

Proof Each $(T - \lambda_j I)|_{G(\lambda_j, T)}$ is nilpotent [see 8.21(c)]. For each j, choose
a basis of $G(\lambda_j, T)$, which is a vector space with dimension $d_j$, such that the
matrix of $(T - \lambda_j I)|_{G(\lambda_j, T)}$ with respect to this basis is as in 8.19. Thus the
matrix of $T|_{G(\lambda_j, T)}$, which equals $(T - \lambda_j I)|_{G(\lambda_j, T)} + \lambda_j I|_{G(\lambda_j, T)}$, with
respect to this basis will look like the desired form shown above for $A_j$.
Putting the bases of the $G(\lambda_j, T)$'s together gives a basis of V [by 8.21(a)].
The matrix of T with respect to this basis has the desired form. ∎


The 5-by-5 matrix in 8.28 is of the form promised by 8.29, with each of 
the blocks itself an upper-triangular matrix that is constant along the diagonal 
of the block. If T is an operator on a 5-dimensional vector space whose matrix 
is as in 8.28, then the eigenvalues of T are 4,2, 1 (as follows from 5.32), with 
multiplicities 1, 2, 2. 


8.30 Example Suppose $T \in \mathcal{L}(\mathbf{C}^3)$ is defined by

$$T(z_1, z_2, z_3) = (6z_1 + 3z_2 + 4z_3, 6z_2 + 2z_3, 7z_3).$$

The matrix of T (with respect to the standard basis) is

$$\begin{pmatrix} 6 & 3 & 4 \\ 0 & 6 & 2 \\ 0 & 0 & 7 \end{pmatrix},$$

which is an upper-triangular matrix but is not of the form promised by 8.29.
As we saw in Example 8.25, the eigenvalues of T are 6 and 7 and the
corresponding generalized eigenspaces are

$$G(6, T) = \operatorname{span}\bigl((1, 0, 0), (0, 1, 0)\bigr) \quad\text{and}\quad G(7, T) = \operatorname{span}\bigl((10, 2, 1)\bigr).$$

We also saw that a basis of $\mathbf{C}^3$ consisting of generalized eigenvectors of T is

$$(1, 0, 0), (0, 1, 0), (10, 2, 1).$$

The matrix of T with respect to this basis is

$$\left(\begin{array}{cc|c}
6 & 3 & 0 \\
0 & 6 & 0 \\ \hline
0 & 0 & 7
\end{array}\right),$$

which is a matrix of the block diagonal form promised by 8.29.
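
The change of basis in this example can be confirmed without inverting anything. In the sketch below, P is the matrix whose columns are the basis vectors (1, 0, 0), (0, 1, 0), (10, 2, 1) and B is the claimed block diagonal matrix; the names P and B are introduced here for illustration. B is the matrix of T with respect to that basis exactly when TP = PB.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Matrix of T with respect to the standard basis.
T = [[6, 3, 4],
     [0, 6, 2],
     [0, 0, 7]]

# Columns of P are the generalized eigenvectors (1,0,0), (0,1,0), (10,2,1).
P = [[1, 0, 10],
     [0, 1, 2],
     [0, 0, 1]]

# Claimed matrix of T with respect to the new basis.
B = [[6, 3, 0],
     [0, 6, 0],
     [0, 0, 7]]

# Column j of T*P is T applied to the j-th basis vector; column j of P*B
# expresses that image as a combination of the basis vectors.
left = matmul(T, P)
right = matmul(P, B)
print(left == right)
```

The third column of both products is (70, 14, 7), which is 7 times (10, 2, 1), matching the 1-by-1 block (7).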


When we discuss the Jordan Form in Section 8.D, we will see that we can 
find a basis with respect to which an operator T has a matrix with even more 
0’s than promised by 8.29. However, 8.29 and its equivalent companion 8.21 
are already quite powerful. For example, in the next subsection we will use 
8.21 to show that every invertible operator on a complex vector space has a 
square root. 
Square Roots


Recall that a square root of an operator $T \in \mathcal{L}(V)$ is an operator $R \in \mathcal{L}(V)$
such that $R^2 = T$ (see 7.33). Every complex number has a square root, but
not every operator on a complex vector space has a square root. For example,
the operator on $\mathbf{C}^3$ in Exercise 6 in Section 8.A has no square root. The
noninvertibility of that operator is no accident, as we will soon see. We begin
by showing that the identity plus any nilpotent operator has a square root.


8.31 Identity plus nilpotent has a square root 


Suppose $N \in \mathcal{L}(V)$ is nilpotent. Then I + N has a square root.


Proof Consider the Taylor series for the function $\sqrt{1 + x}$:

8.32   $\sqrt{1 + x} = 1 + a_1 x + a_2 x^2 + \cdots.$

We will not find an explicit formula for the coefficients or worry about
whether the infinite sum converges because we will use this equation only as
motivation.

[Because $a_1 = 1/2$, the formula above shows that $1 + x/2$ is a
good estimate for $\sqrt{1 + x}$ when x is small.]

Because N is nilpotent, $N^m = 0$ for some positive integer m. In 8.32,
suppose we replace x with N and 1 with I. Then the infinite sum on the right
side becomes a finite sum (because $N^j = 0$ for all $j \ge m$). In other words,
we guess that there is a square root of I + N of the form

$$I + a_1 N + a_2 N^2 + \cdots + a_{m-1} N^{m-1}.$$

Having made this guess, we can try to choose $a_1, a_2, \dots, a_{m-1}$ such that the
operator above has its square equal to I + N. Now

$$\begin{aligned}
&(I + a_1 N + a_2 N^2 + a_3 N^3 + \cdots + a_{m-1} N^{m-1})^2 \\
&\quad = I + 2a_1 N + (2a_2 + a_1^2) N^2 + (2a_3 + 2a_1 a_2) N^3 + \cdots \\
&\qquad + (2a_{m-1} + \text{terms involving } a_1, \dots, a_{m-2}) N^{m-1}.
\end{aligned}$$

We want the right side of the equation above to equal I + N. Hence choose $a_1$
such that $2a_1 = 1$ (thus $a_1 = 1/2$). Next, choose $a_2$ such that $2a_2 + a_1^2 = 0$
(thus $a_2 = -1/8$). Then choose $a_3$ such that the coefficient of $N^3$ on the
right side of the equation above equals 0 (thus $a_3 = 1/16$). Continue in
this fashion for $j = 4, \dots, m - 1$, at each step solving for $a_j$ so that the
coefficient of $N^j$ on the right side of the equation above equals 0. Actually
we do not care about the explicit formula for the $a_j$'s. We need only know
that some choice of the $a_j$'s gives a square root of I + N. ∎
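
The coefficient-by-coefficient procedure in this proof translates directly into a short recursion. The sketch below, using exact fractions, solves for each coefficient in turn and then checks the result on one nilpotent matrix; the 4-by-4 matrix N and the function names are assumptions chosen for illustration, not taken from the text.

```python
from fractions import Fraction


def sqrt_coefficients(m):
    """Return [a_0, ..., a_{m-1}] with a_0 = 1 such that the square of
    a_0 + a_1 x + ... + a_{m-1} x^{m-1} agrees with 1 + x below degree m."""
    a = [Fraction(1)]
    for j in range(1, m):
        target = Fraction(1) if j == 1 else Fraction(0)
        # The coefficient of x^j in the square is 2*a_j plus these cross terms.
        cross = sum(a[i] * a[j - i] for i in range(1, j))
        a.append((target - cross) / 2)
    return a


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def sqrt_identity_plus(N, m):
    """Return I + a_1 N + ... + a_{m-1} N^{m-1}, assuming N^m = 0."""
    n = len(N)
    result = [[Fraction(0)] * n for _ in range(n)]
    power = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for coefficient in sqrt_coefficients(m):
        result = [[result[i][j] + coefficient * power[i][j] for j in range(n)]
                  for i in range(n)]
        power = matmul(power, N)
    return result


# A strictly upper-triangular (hence nilpotent) matrix with N^4 = 0.
N = [[0, 1, 2, 0],
     [0, 0, 3, 1],
     [0, 0, 0, 5],
     [0, 0, 0, 0]]

R = sqrt_identity_plus(N, 4)
I_plus_N = [[N[i][j] + int(i == j) for j in range(4)] for i in range(4)]
print(sqrt_coefficients(4))
print(matmul(R, R) == I_plus_N)
```

The recursion reproduces the values 1/2, -1/8, and 1/16 found in the proof. Exact fractions are used so that the final comparison is an equality rather than an approximation.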
The previous lemma is valid on real and complex vector spaces. However,
the next result holds only on complex vector spaces. For example, the operator 
of multiplication by —1 on the 1-dimensional real vector space R has no square 
root. 


8.33 Over C, invertible operators have square roots 


Suppose V is a complex vector space and $T \in \mathcal{L}(V)$ is invertible. Then
T has a square root.


Proof Let $\lambda_1, \dots, \lambda_m$ be the distinct eigenvalues of T. For each j, there
exists a nilpotent operator $N_j \in \mathcal{L}(G(\lambda_j, T))$ such that $T|_{G(\lambda_j, T)} = \lambda_j I + N_j$
[see 8.21(c)]. Because T is invertible, none of the $\lambda_j$'s equals 0, so we can
write

$$T|_{G(\lambda_j, T)} = \lambda_j \Bigl(I + \frac{N_j}{\lambda_j}\Bigr)$$

for each j. Clearly $N_j/\lambda_j$ is nilpotent, and so $I + N_j/\lambda_j$ has a square root
(by 8.31). Multiplying a square root of the complex number $\lambda_j$ by a square
root of $I + N_j/\lambda_j$, we obtain a square root $R_j$ of $T|_{G(\lambda_j, T)}$.
A typical vector $v \in V$ can be written uniquely in the form

$$v = u_1 + \cdots + u_m,$$

where each $u_j$ is in $G(\lambda_j, T)$ (see 8.21). Using this decomposition, define an
operator $R \in \mathcal{L}(V)$ by

$$Rv = R_1 u_1 + \cdots + R_m u_m.$$

You should verify that this operator R is a square root of T, completing the
proof. ∎
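
For a concrete feel for the construction, here is a small worked instance (an added illustration, with the operator chosen for convenience rather than taken from the text). On $\mathbf{C}^2$ take $T(w, z) = (4w + z, 4z)$, which has the single eigenvalue 4, so that $T = 4I + N$ with $N(w, z) = (z, 0)$ and $N^2 = 0$.

```latex
\[
T = 4\Bigl(I + \tfrac{1}{4}N\Bigr), \qquad
\sqrt{I + \tfrac{1}{4}N} = I + \tfrac{1}{2}\cdot\tfrac{1}{4}N = I + \tfrac{1}{8}N
\quad (\text{because } N^2 = 0),
\]
\[
R = 2\Bigl(I + \tfrac{1}{8}N\Bigr) = 2I + \tfrac{1}{4}N, \qquad
R^2 = 4I + N + \tfrac{1}{16}N^2 = 4I + N = T.
\]
```

In coordinates, $R(w, z) = (2w + z/4, 2z)$, and applying R twice returns $(4w + z, 4z)$.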


By imitating the techniques in this section, you should be able to prove
that if V is a complex vector space and $T \in \mathcal{L}(V)$ is invertible, then T has a
kth root for every positive integer k.


EXERCISES 8.B 


1 Suppose V is a complex vector space, $N \in \mathcal{L}(V)$, and 0 is the only
eigenvalue of N. Prove that N is nilpotent.


2 Give an example of an operator T on a finite-dimensional real vector
space such that 0 is the only eigenvalue of T but T is not nilpotent.
3 Suppose $T \in \mathcal{L}(V)$. Suppose $S \in \mathcal{L}(V)$ is invertible. Prove that T and
$S^{-1}TS$ have the same eigenvalues with the same multiplicities.


4 Suppose V is an n-dimensional complex vector space and T is an
operator on V such that $\operatorname{null} T^{n-2} \neq \operatorname{null} T^{n-1}$. Prove that T has at most
two distinct eigenvalues.


5 Suppose V is a complex vector space and $T \in \mathcal{L}(V)$. Prove that V has
a basis consisting of eigenvectors of T if and only if every generalized
eigenvector of T is an eigenvector of T.

[For $\mathbf{F} = \mathbf{C}$, the exercise above adds an equivalence to the list in 5.41.]


6 Define $N \in \mathcal{L}(\mathbf{F}^5)$ by

$$N(x_1, x_2, x_3, x_4, x_5) = (2x_2, 3x_3, -x_4, 4x_5, 0).$$

Find a square root of I + N.


7 Suppose V is a complex vector space. Prove that every invertible operator
on V has a cube root.


8 Suppose $T \in \mathcal{L}(V)$ and 3 and 8 are eigenvalues of T. Let $n = \dim V$.
Prove that $V = (\operatorname{null} T^{n-2}) \oplus (\operatorname{range} T^{n-2})$.


9 Suppose A and B are block diagonal matrices of the form

$$A = \begin{pmatrix} A_1 & & 0 \\ & \ddots & \\ 0 & & A_m \end{pmatrix}, \qquad
B = \begin{pmatrix} B_1 & & 0 \\ & \ddots & \\ 0 & & B_m \end{pmatrix},$$

where $A_j$ has the same size as $B_j$ for $j = 1, \dots, m$. Show that AB is a
block diagonal matrix of the form

$$AB = \begin{pmatrix} A_1 B_1 & & 0 \\ & \ddots & \\ 0 & & A_m B_m \end{pmatrix}.$$


10 Suppose $\mathbf{F} = \mathbf{C}$ and $T \in \mathcal{L}(V)$. Prove that there exist $D, N \in \mathcal{L}(V)$
such that T = D + N, the operator D is diagonalizable, N is nilpotent,
and DN = ND.


11 Suppose $T \in \mathcal{L}(V)$ and $\lambda \in \mathbf{F}$. Prove that for every basis of V with
respect to which T has an upper-triangular matrix, the number of times
that $\lambda$ appears on the diagonal of the matrix of T equals the multiplicity
of $\lambda$ as an eigenvalue of T.
8.C Characteristic and Minimal Polynomials


The Cayley—Hamilton Theorem 

The next definition associates a polynomial with each operator on V if F = C. 

For F = R, the corresponding definition will be given in the next chapter. 
8.34 Definition characteristic polynomial 


Suppose V is a complex vector space and $T \in \mathcal{L}(V)$. Let $\lambda_1, \dots, \lambda_m$
denote the distinct eigenvalues of T, with multiplicities $d_1, \dots, d_m$. The
polynomial

$$(z - \lambda_1)^{d_1} \cdots (z - \lambda_m)^{d_m}$$

is called the characteristic polynomial of T.


8.35 Example Suppose $T \in \mathcal{L}(\mathbf{C}^3)$ is defined as in Example 8.25. Because
the eigenvalues of T are 6, with multiplicity 2, and 7, with multiplicity 1,
we see that the characteristic polynomial of T is $(z - 6)^2 (z - 7)$.


8.36 Degree and zeros of characteristic polynomial 


Suppose V is a complex vector space and $T \in \mathcal{L}(V)$. Then


(a) the characteristic polynomial of T has degree dim V;


(b) the zeros of the characteristic polynomial of T are the eigenvalues
of T.


Proof Clearly part (a) follows from 8.26 and part (b) follows from the
definition of the characteristic polynomial. ∎


Most texts define the characteristic polynomial using determinants (the 
two definitions are equivalent by 10.25). The approach taken here, which 
is considerably simpler, leads to the following easy proof of the Cayley— 
Hamilton Theorem. In the next chapter, we will see that this result also holds 
on real vector spaces (see 9.24). 


8.37 Cayley—Hamilton Theorem 


Suppose V is a complex vector space and $T \in \mathcal{L}(V)$. Let q denote the
characteristic polynomial of T. Then q(T) = 0.

[English mathematician Arthur Cayley (1821-1895) published three math
papers before completing his undergraduate degree in 1842. Irish
mathematician William Rowan Hamilton (1805-1865) was made a professor in
1827 when he was 22 years old and still an undergraduate!]

Proof Let $\lambda_1, \dots, \lambda_m$ be the distinct eigenvalues of the operator T, and let
$d_1, \dots, d_m$ be the dimensions of the corresponding generalized eigenspaces
$G(\lambda_1, T), \dots, G(\lambda_m, T)$. For each $j \in \{1, \dots, m\}$, we know that
$(T - \lambda_j I)|_{G(\lambda_j, T)}$ is nilpotent. Thus we have $(T - \lambda_j I)^{d_j}|_{G(\lambda_j, T)} = 0$ (by
8.18).

Every vector in V is a sum of vectors in $G(\lambda_1, T), \dots, G(\lambda_m, T)$ (by 8.21).
Thus to prove that q(T) = 0, we need only show that $q(T)|_{G(\lambda_j, T)} = 0$ for
each j.

Thus fix $j \in \{1, \dots, m\}$. We have

$$q(T) = (T - \lambda_1 I)^{d_1} \cdots (T - \lambda_m I)^{d_m}.$$

The operators on the right side of the equation above all commute, so we can
move the factor $(T - \lambda_j I)^{d_j}$ to be the last term in the expression on the right.
Because $(T - \lambda_j I)^{d_j}|_{G(\lambda_j, T)} = 0$, we conclude that $q(T)|_{G(\lambda_j, T)} = 0$, as
desired. ∎
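
As a quick sanity check of the theorem on a familiar operator, the sketch below evaluates the characteristic polynomial $(z - 6)^2 (z - 7)$ from Example 8.35 at the matrix of T from Example 8.25. The helper names `matmul` and `shift` are illustrative assumptions.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def shift(matrix, lam):
    """Return the matrix of T - lam*I."""
    n = len(matrix)
    return [[matrix[i][j] - (lam if i == j else 0) for j in range(n)]
            for i in range(n)]


# Matrix of T from Example 8.25 with respect to the standard basis.
T = [[6, 3, 4],
     [0, 6, 2],
     [0, 0, 7]]

# q(T) = (T - 6I)^2 (T - 7I), where q is the characteristic polynomial.
A = shift(T, 6)
B = shift(T, 7)
q_of_T = matmul(matmul(A, A), B)
print(q_of_T)
```

Because the factors commute, multiplying them in any other order gives the same zero matrix.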


The Minimal Polynomial 


In this subsection we introduce another important polynomial associated with 
each operator. We begin with the following definition. 


8.38 Definition monic polynomial 


A monic polynomial is a polynomial whose highest-degree coefficient 
equals 1. 


8.39 Example The polynomial $2 + 9z^2 + z^7$ is a monic polynomial of
degree 7.


8.40 Minimal polynomial


Suppose $T \in \mathcal{L}(V)$. Then there is a unique monic polynomial p of
smallest degree such that p(T) = 0.
Proof Let $n = \dim V$. Then the list

$$I, T, T^2, \dots, T^{n^2}$$

is not linearly independent in $\mathcal{L}(V)$, because the vector space $\mathcal{L}(V)$ has
dimension $n^2$ (see 3.61) and we have a list of length $n^2 + 1$. Let m be the
smallest positive integer such that the list

8.41   $I, T, T^2, \dots, T^m$

is linearly dependent. The Linear Dependence Lemma (2.21) implies that one
of the operators in the list above is a linear combination of the previous ones.
Because m was chosen to be the smallest positive integer such that the list
above is linearly dependent, we conclude that $T^m$ is a linear combination of
$I, T, T^2, \dots, T^{m-1}$. Thus there exist scalars $a_0, a_1, a_2, \dots, a_{m-1} \in \mathbf{F}$ such
that

8.42   $a_0 I + a_1 T + a_2 T^2 + \cdots + a_{m-1} T^{m-1} + T^m = 0.$

Define a monic polynomial $p \in \mathcal{P}(\mathbf{F})$ by

$$p(z) = a_0 + a_1 z + a_2 z^2 + \cdots + a_{m-1} z^{m-1} + z^m.$$

Then 8.42 implies that p(T) = 0.

To prove the uniqueness part of the result, note that the choice of m implies
that no monic polynomial $q \in \mathcal{P}(\mathbf{F})$ with degree smaller than m can satisfy
q(T) = 0. Suppose $q \in \mathcal{P}(\mathbf{F})$ is a monic polynomial with degree m and
q(T) = 0. Then (p - q)(T) = 0 and deg(p - q) < m. The choice of m now
implies that q = p, completing the proof. ∎


The last result justifies the following definition. 


8.43 Definition minimal polynomial 


Suppose $T \in \mathcal{L}(V)$. Then the minimal polynomial of T is the unique
monic polynomial p of smallest degree such that p(T) = 0.


The proof of the last result shows that the degree of the minimal polynomial 
of each operator on V is at most (dim V)?. The Cayley-Hamilton Theorem 
(8.37) tells us that if V is a complex vector space, then the minimal polynomial 
of each operator on V has degree at most dim V. This remarkable improvement 
also holds on real vector spaces, as we will see in the next chapter. 
Suppose you are given the matrix (with respect to some basis) of an
operator $T \in \mathcal{L}(V)$. You could program a computer to find the minimal
polynomial of T as follows: Consider the system of linear equations

8.44   $a_0 \mathcal{M}(I) + a_1 \mathcal{M}(T) + \cdots + a_{m-1} \mathcal{M}(T)^{m-1} = -\mathcal{M}(T)^m$

for successive values of $m = 1, 2, \dots$ until this system of equations has a
solution $a_0, a_1, a_2, \dots, a_{m-1}$. The scalars $a_0, a_1, a_2, \dots, a_{m-1}, 1$ will then be the
coefficients of the minimal polynomial of T. All this can be computed using a
familiar and fast (for a computer) process such as Gaussian elimination.

[Think of this as a system of $(\dim V)^2$ linear equations in m
variables $a_0, a_1, \dots, a_{m-1}$.]


8.45 Example Let T be the operator on $\mathbf{C}^5$ whose matrix (with respect
to the standard basis) is

$$\begin{pmatrix}
0 & 0 & 0 & 0 & -3 \\
1 & 0 & 0 & 0 & 6 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0
\end{pmatrix}.$$

Find the minimal polynomial of T.


Solution Because of the large number of 0's in this matrix, Gaussian
elimination is not needed here. Simply compute powers of $\mathcal{M}(T)$, and then
you will notice that there is clearly no solution to 8.44 until m = 5. Do
the computations and you will see that the minimal polynomial of T equals
$z^5 - 6z + 3$.
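
The procedure built around 8.44 can be sketched in a few lines of Python. The version below is one possible implementation under stated assumptions: it uses exact rational arithmetic, flattens each matrix power into a vector of $(\dim V)^2$ entries, and runs Gaussian elimination for m = 1, 2, ... until the system is consistent. The function names `solve_exact` and `minimal_polynomial` are hypothetical.

```python
from fractions import Fraction


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def solve_exact(columns, target):
    """Solve sum_k x_k * columns[k] = target by Gaussian elimination.
    Returns a solution (free variables set to 0) or None if inconsistent."""
    m = len(columns)
    rows = [[Fraction(col[r]) for col in columns] + [Fraction(target[r])]
            for r in range(len(target))]
    pivot_row, pivot_cols = 0, []
    for c in range(m):
        p = next((r for r in range(pivot_row, len(rows)) if rows[r][c] != 0), None)
        if p is None:
            continue
        rows[pivot_row], rows[p] = rows[p], rows[pivot_row]
        lead = rows[pivot_row][c]
        rows[pivot_row] = [x / lead for x in rows[pivot_row]]
        for r in range(len(rows)):
            if r != pivot_row and rows[r][c] != 0:
                f = rows[r][c]
                rows[r] = [x - f * y for x, y in zip(rows[r], rows[pivot_row])]
        pivot_cols.append(c)
        pivot_row += 1
    # A row reading 0 = nonzero means the system has no solution.
    if any(all(x == 0 for x in row[:-1]) and row[-1] != 0 for row in rows):
        return None
    solution = [Fraction(0)] * m
    for i, c in enumerate(pivot_cols):
        solution[c] = rows[i][-1]
    return solution


def minimal_polynomial(M):
    """Return [a_0, ..., a_{m-1}, 1], the coefficients of the minimal polynomial."""
    n = len(M)
    flatten = lambda A: [x for row in A for x in row]
    powers = [[[int(i == j) for j in range(n)] for i in range(n)]]  # M^0 = I
    for m in range(1, n * n + 1):
        powers.append(matmul(powers[-1], M))
        target = [-x for x in flatten(powers[m])]
        coeffs = solve_exact([flatten(P) for P in powers[:m]], target)
        if coeffs is not None:
            return coeffs + [Fraction(1)]
    raise AssertionError("a solution must exist for some m <= n^2 (see 8.40)")


# Matrix from Example 8.45.
M = [[0, 0, 0, 0, -3],
     [1, 0, 0, 0, 6],
     [0, 1, 0, 0, 0],
     [0, 0, 1, 0, 0],
     [0, 0, 0, 1, 0]]
print(minimal_polynomial(M))
```

The returned list is read from the constant term upward, so a result of 3, -6, 0, 0, 0, 1 corresponds to $z^5 - 6z + 3$. Exact fractions avoid the rounding issues that floating-point elimination would introduce when deciding whether a system is consistent.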


The next result completely characterizes the polynomials that when applied 
to an operator give the 0 operator. 


8.46 q(T) = 0 implies q is a multiple of the minimal polynomial 
Suppose $T \in \mathcal{L}(V)$ and $q \in \mathcal{P}(\mathbf{F})$. Then q(T) = 0 if and only if q is a
polynomial multiple of the minimal polynomial of T.


Proof Let p denote the minimal polynomial of T.
First we prove the easy direction. Suppose q is a polynomial multiple of p.
Thus there exists a polynomial $s \in \mathcal{P}(\mathbf{F})$ such that q = ps. We have

$$q(T) = p(T)s(T) = 0\,s(T) = 0,$$

as desired.




To prove the other direction, now suppose q(T) = 0. By the Division
Algorithm for Polynomials (4.8), there exist polynomials $s, r \in \mathcal{P}(\mathbf{F})$ such
that

8.47   $q = ps + r$

and deg r < deg p. We have

$$0 = q(T) = p(T)s(T) + r(T) = r(T).$$

The equation above implies that r = 0 (otherwise, dividing r by its
highest-degree coefficient would produce a monic polynomial that when applied to
T gives 0; this polynomial would have a smaller degree than the minimal
polynomial, which would be a contradiction). Thus 8.47 becomes the equation
q = ps. Hence q is a polynomial multiple of p, as desired. ∎


The next result is stated only for complex vector spaces, because we have 
not yet defined the characteristic polynomial when F = R. However, the 
result also holds for real vector spaces, as we will see in the next chapter. 


8.48 Characteristic polynomial is a multiple of minimal polynomial 


Suppose $\mathbf{F} = \mathbf{C}$ and $T \in \mathcal{L}(V)$. Then the characteristic polynomial of T
is a polynomial multiple of the minimal polynomial of T.


Proof The desired result follows immediately from the Cayley-Hamilton
Theorem (8.37) and 8.46. ∎


We know (at least when F = C) that the zeros of the characteristic 
polynomial of T are the eigenvalues of T (see 8.36). Now we show that the 
minimal polynomial has the same zeros (although the multiplicities of these 
zeros may differ). 


8.49 Eigenvalues are the zeros of the minimal polynomial 


Let $T \in \mathcal{L}(V)$. Then the zeros of the minimal polynomial of $T$ are
precisely the eigenvalues of $T$.


Proof Let

$$p(z) = a_0 + a_1 z + a_2 z^2 + \cdots + a_{m-1} z^{m-1} + z^m$$

be the minimal polynomial of $T$.




First suppose $\lambda \in \mathbf{F}$ is a zero of $p$. Then $p$ can be written in the form

$$p(z) = (z - \lambda) q(z),$$

where $q$ is a monic polynomial with coefficients in $\mathbf{F}$ (see 4.11). Because
$p(T) = 0$, we have

$$0 = (T - \lambda I)(q(T)v)$$

for all $v \in V$. Because the degree of $q$ is less than the degree of the minimal
polynomial $p$, there exists at least one vector $v \in V$ such that $q(T)v \ne 0$.
The equation above thus implies that $\lambda$ is an eigenvalue of $T$, as desired.

To prove the other direction, now suppose $\lambda \in \mathbf{F}$ is an eigenvalue of $T$.
Thus there exists $v \in V$ with $v \ne 0$ such that $Tv = \lambda v$. Repeated applications
of $T$ to both sides of this equation show that $T^j v = \lambda^j v$ for every nonnegative
integer $j$. Thus

$$\begin{aligned}
0 = p(T)v &= (a_0 I + a_1 T + a_2 T^2 + \cdots + a_{m-1} T^{m-1} + T^m)v \\
&= (a_0 + a_1 \lambda + a_2 \lambda^2 + \cdots + a_{m-1} \lambda^{m-1} + \lambda^m)v \\
&= p(\lambda)v.
\end{aligned}$$


Because $v \ne 0$, the equation above implies that $p(\lambda) = 0$, as desired. ∎


The next three examples show how our results can be useful in finding 
minimal polynomials and in understanding why eigenvalues of some operators 
cannot be exactly computed. 


8.50 Example Find the minimal polynomial of the operator $T \in \mathcal{L}(\mathbf{C}^3)$
in Example 8.30.


Solution In Example 8.30 we noted that the eigenvalues of $T$ are 6 and 7.
Thus by 8.49, the minimal polynomial of $T$ is a polynomial multiple of
$(z - 6)(z - 7)$.

In Example 8.35, we saw that the characteristic polynomial of $T$ is
$(z - 6)^2(z - 7)$. Thus by 8.48 and the paragraph above, the minimal polynomial
of $T$ is either $(z - 6)(z - 7)$ or $(z - 6)^2(z - 7)$. A simple computation
shows that

$$(T - 6I)(T - 7I) \ne 0.$$


Thus the minimal polynomial of $T$ is $(z - 6)^2(z - 7)$.
8.51 Example Find the minimal polynomial of the operator $T \in \mathcal{L}(\mathbf{C}^3)$
defined by $T(z_1, z_2, z_3) = (6z_1, 6z_2, 7z_3)$.


Solution It is easy to see that for this operator $T$, the eigenvalues of $T$ are 6
and 7, and the characteristic polynomial of $T$ is $(z - 6)^2(z - 7)$.

Thus as in the previous example, the minimal polynomial of $T$ is either
$(z - 6)(z - 7)$ or $(z - 6)^2(z - 7)$. A simple computation shows that
$(T - 6I)(T - 7I) = 0$. Thus the minimal polynomial of $T$ is $(z - 6)(z - 7)$.
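
The "simple computations" in the two examples above are easy to carry out by machine. The sketch below is illustrative only. The matrix `D` is the matrix of the operator in Example 8.51 with respect to the standard basis. The matrix `U` is an assumed upper-triangular matrix with the same characteristic polynomial $(z - 6)^2(z - 7)$, chosen here to show the other possible outcome; it is not quoted from Example 8.30. The helper names are hypothetical.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def minus_scalar(A, c):
    """Return A - c*I."""
    n = len(A)
    return [[A[i][j] - (c if i == j else 0) for j in range(n)]
            for i in range(n)]


def is_zero(A):
    return all(x == 0 for row in A for x in row)


D = [[6, 0, 0], [0, 6, 0], [0, 0, 7]]   # operator of Example 8.51
U = [[6, 3, 4], [0, 6, 2], [0, 0, 7]]   # assumed illustrative matrix

for name, M in (("D", D), ("U", U)):
    first = mat_mul(minus_scalar(M, 6), minus_scalar(M, 7))   # (M-6I)(M-7I)
    second = mat_mul(minus_scalar(M, 6), first)               # (M-6I)^2 (M-7I)
    print(name, is_zero(first), is_zero(second))
```

For `D` the product $(T - 6I)(T - 7I)$ already vanishes, so the minimal polynomial is $(z - 6)(z - 7)$; for `U` only the product with the squared factor vanishes, so the minimal polynomial is $(z - 6)^2(z - 7)$.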


8.52 Example What are the eigenvalues of the operator in Example 8.45? 


Solution From 8.49 and the solution to Example 8.45, we see that the 
eigenvalues of T equal the solutions to the equation 


$$z^5 - 6z + 3 = 0.$$


Unfortunately, no solution to this equation can be computed using rational 
numbers, roots of rational numbers, and the usual rules of arithmetic (a proof 
of this would take us considerably beyond linear algebra). Thus we cannot find 
an exact expression for any eigenvalue of T in any familiar form, although 
numeric techniques can give good approximations for the eigenvalues of 
T. The numeric techniques, which we will not discuss here, show that the 
eigenvalues for this particular operator are approximately 


$$-1.67, \quad 0.51, \quad 1.40, \quad -0.12 + 1.59i, \quad -0.12 - 1.59i.$$


The nonreal eigenvalues occur as a pair, with each the complex conjugate of 
the other, as expected for a polynomial with real coefficients (see 4.15). 
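
The approximations above can be reproduced with elementary numerics. The sketch below is one possible approach, not the technique the text has in mind (the text does not name one). It locates the three real zeros of $z^5 - 6z + 3$ by bisection on intervals where the polynomial changes sign, and then recovers the nonreal pair from the facts that the five zeros sum to 0 and multiply to $-3$ (the negatives of the $z^4$ coefficient and the constant term). The function names are hypothetical.

```python
import cmath


def p(z):
    return z**5 - 6*z + 3


def bisect(f, lo, hi, steps=60):
    """Zero of f in [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


# p changes sign on each of these intervals: p(-2) < 0 < p(-1),
# p(0) > 0 > p(1), and p(1) < 0 < p(2).
real_roots = [bisect(p, a, b) for a, b in [(-2, -1), (0, 1), (1, 2)]]

# The remaining two zeros are the zeros of z^2 + b z + c, where
# b = sum of the real zeros and c = -3 / (product of the real zeros).
b = sum(real_roots)
c = -3 / (real_roots[0] * real_roots[1] * real_roots[2])
disc = cmath.sqrt(b * b - 4 * c)
pair = [(-b + disc) / 2, (-b - disc) / 2]

print([round(x, 2) for x in real_roots])
print([complex(round(w.real, 2), round(w.imag, 2)) for w in pair])
```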


EXERCISES 8.C 


1 Suppose $T \in \mathcal{L}(\mathbf{C}^4)$ is such that the eigenvalues of $T$ are 3, 5, 8. Prove
that $(T - 3I)^2 (T - 5I)^2 (T - 8I)^2 = 0$.


2 Suppose $V$ is a complex vector space. Suppose $T \in \mathcal{L}(V)$ is such that 5
and 6 are eigenvalues of $T$ and that $T$ has no other eigenvalues. Prove
that $(T - 5I)^{n-1} (T - 6I)^{n-1} = 0$, where $n = \dim V$.


3 Give an example of an operator on $\mathbf{C}^4$ whose characteristic polynomial
equals $(z - 7)^2 (z - 8)^2$.


4 Give an example of an operator on $\mathbf{C}^4$ whose characteristic polynomial
equals $(z - 1)(z - 5)^3$ and whose minimal polynomial equals
$(z - 1)(z - 5)^2$.


5 Give an example of an operator on $\mathbf{C}^4$ whose characteristic and minimal
polynomials both equal $z(z - 1)^2(z - 3)$.


6 Give an example of an operator on $\mathbf{C}^4$ whose characteristic polynomial
equals $z(z - 1)^2(z - 3)$ and whose minimal polynomial equals
$z(z - 1)(z - 3)$.


7 Suppose $V$ is a complex vector space. Suppose $P \in \mathcal{L}(V)$ is such that
$P^2 = P$. Prove that the characteristic polynomial of $P$ is $z^m (z - 1)^n$,
where $m = \dim \operatorname{null} P$ and $n = \dim \operatorname{range} P$.


8 Suppose $T \in \mathcal{L}(V)$. Prove that $T$ is invertible if and only if the constant
term in the minimal polynomial of $T$ is nonzero.


9 Suppose $T \in \mathcal{L}(V)$ has minimal polynomial $4 + 5z - 6z^2 - 7z^3 + 2z^4 + z^5$.
Find the minimal polynomial of $T^{-1}$.


10 Suppose $V$ is a complex vector space and $T \in \mathcal{L}(V)$ is invertible.
Let $p$ denote the characteristic polynomial of $T$ and let $q$ denote the
characteristic polynomial of $T^{-1}$. Prove that

$$q(z) = \frac{1}{p(0)}\, z^{\dim V}\, p\Big(\frac{1}{z}\Big)$$

for all nonzero $z \in \mathbf{C}$.


11 Suppose $T \in \mathcal{L}(V)$ is invertible. Prove that there exists a polynomial
$p \in \mathcal{P}(\mathbf{F})$ such that $T^{-1} = p(T)$.


12 Suppose $V$ is a complex vector space and $T \in \mathcal{L}(V)$. Prove that $V$
has a basis consisting of eigenvectors of $T$ if and only if the minimal
polynomial of $T$ has no repeated zeros.

[For complex vector spaces, the exercise above adds another equivalence
to the list given by 5.41.]


13 Suppose $V$ is an inner product space and $T \in \mathcal{L}(V)$ is normal. Prove
that the minimal polynomial of $T$ has no repeated zeros.


14 Suppose $V$ is a complex inner product space and $S \in \mathcal{L}(V)$ is an
isometry. Prove that the constant term in the characteristic polynomial
of $S$ has absolute value 1.
15 Suppose $T \in \mathcal{L}(V)$ and $v \in V$.


(a) Prove that there exists a unique monic polynomial $p$ of smallest
degree such that $p(T)v = 0$.


(b) Prove that $p$ divides the minimal polynomial of $T$.
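

16 Suppose $V$ is an inner product space and $T \in \mathcal{L}(V)$. Suppose

$$a_0 + a_1 z + a_2 z^2 + \cdots + a_{m-1} z^{m-1} + z^m$$

is the minimal polynomial of $T$. Prove that

$$\overline{a_0} + \overline{a_1} z + \overline{a_2} z^2 + \cdots + \overline{a_{m-1}} z^{m-1} + z^m$$

is the minimal polynomial of $T^*$.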


17 Suppose $\mathbf{F} = \mathbf{C}$ and $T \in \mathcal{L}(V)$. Suppose the minimal polynomial of $T$
has degree $\dim V$. Prove that the characteristic polynomial of $T$ equals
the minimal polynomial of $T$.


18 Suppose $a_0, \dots, a_{n-1} \in \mathbf{C}$. Find the minimal and characteristic polynomials
of the operator on $\mathbf{C}^n$ whose matrix (with respect to the standard
basis) is

$$\begin{pmatrix}
0 & & & & -a_0 \\
1 & 0 & & & -a_1 \\
 & 1 & \ddots & & -a_2 \\
 & & \ddots & 0 & \vdots \\
 & & & 1 & -a_{n-1}
\end{pmatrix}.$$


[The exercise above shows that every monic polynomial is the characteristic
polynomial of some operator.]


19 Suppose $V$ is a complex vector space and $T \in \mathcal{L}(V)$. Suppose that
with respect to some basis of $V$ the matrix of $T$ is upper triangular, with
$\lambda_1, \dots, \lambda_n$ on the diagonal of this matrix. Prove that the characteristic
polynomial of $T$ is $(z - \lambda_1) \cdots (z - \lambda_n)$.


20 Suppose $V$ is a complex vector space and $V_1, \dots, V_m$ are nonzero subspaces
of $V$ such that $V = V_1 \oplus \cdots \oplus V_m$. Suppose $T \in \mathcal{L}(V)$ and
each $V_j$ is invariant under $T$. For each $j$, let $p_j$ denote the characteristic
polynomial of $T|_{V_j}$. Prove that the characteristic polynomial of $T$ equals
$p_1 \cdots p_m$.
8.D Jordan Form


We know that if $V$ is a complex vector space, then for every $T \in \mathcal{L}(V)$ there
is a basis of $V$ with respect to which $T$ has a nice upper-triangular matrix (see
8.29). In this section we will see that we can do even better: there is a basis
of $V$ with respect to which the matrix of $T$ contains 0's everywhere except
possibly on the diagonal and the line directly above the diagonal.

We begin by looking at two examples of nilpotent operators. 


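8.53 Example Let $N \in \mathcal{L}(\mathbf{F}^4)$ be the nilpotent operator defined by

$$N(z_1, z_2, z_3, z_4) = (0, z_1, z_2, z_3).$$

If $v = (1, 0, 0, 0)$, then $N^3 v, N^2 v, Nv, v$ is a basis of $\mathbf{F}^4$. The matrix of $N$
with respect to this basis is

$$\begin{pmatrix}
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0
\end{pmatrix}.$$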


The next example of a nilpotent operator has more complicated behavior 
than the example above. 


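8.54 Example Let $N \in \mathcal{L}(\mathbf{F}^6)$ be the nilpotent operator defined by

$$N(z_1, z_2, z_3, z_4, z_5, z_6) = (0, z_1, z_2, 0, z_4, 0).$$

Unlike the nice behavior of the nilpotent operator of the previous example,
for this nilpotent operator there does not exist a vector $v \in \mathbf{F}^6$ such
that $N^5 v, N^4 v, N^3 v, N^2 v, Nv, v$ is a basis of $\mathbf{F}^6$. However, if we take
$v_1 = (1, 0, 0, 0, 0, 0)$, $v_2 = (0, 0, 0, 1, 0, 0)$, and $v_3 = (0, 0, 0, 0, 0, 1)$, then
$N^2 v_1, N v_1, v_1, N v_2, v_2, v_3$ is a basis of $\mathbf{F}^6$. The matrix of $N$ with respect to
this basis is

$$\left(\begin{array}{ccc|cc|c}
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\ \hline
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\ \hline
0 & 0 & 0 & 0 & 0 & 0
\end{array}\right).$$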

Here the inner matrices are blocked off to show that we can think of the 6-by-6 
matrix above as a block diagonal matrix consisting of a 3-by-3 block with 1’s 
on the line above the diagonal and 0’s elsewhere, a 2-by-2 block with 1 above 
the diagonal and 0’s elsewhere, and a 1-by-1 block containing 0. 
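
The matrix in Example 8.54 can be recomputed directly from the definition of $N$. The following is a small illustrative sketch (the helper `matrix_wrt` is a made-up name). It relies on a feature special to this example: $N$ sends each vector of the chosen basis either to 0 or to another vector of the same basis, so each column of the matrix has at most one nonzero entry, which is a 1.

```python
def N(z):
    """The nilpotent operator of Example 8.54 on 6-tuples."""
    z1, z2, z3, z4, z5, z6 = z
    return (0, z1, z2, 0, z4, 0)


v1 = (1, 0, 0, 0, 0, 0)
v2 = (0, 0, 0, 1, 0, 0)
v3 = (0, 0, 0, 0, 0, 1)

# The basis N^2 v1, N v1, v1, N v2, v2, v3 from the example.
basis = [N(N(v1)), N(v1), v1, N(v2), v2, v3]


def matrix_wrt(op, basis):
    """Matrix of op, assuming op maps each basis vector to 0 or to a basis vector."""
    n = len(basis)
    zero = tuple([0] * n)
    M = [[0] * n for _ in range(n)]
    for col, b in enumerate(basis):
        image = op(b)
        if image != zero:
            M[basis.index(image)][col] = 1   # column col records op(b)
    return M


for row in matrix_wrt(N, basis):
    print(row)
```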
Our next result shows that every nilpotent operator $N \in \mathcal{L}(V)$ behaves
similarly to the previous example. Specifically, there is a finite collection of
vectors $v_1, \dots, v_n \in V$ such that there is a basis of $V$ consisting of the vectors
of the form $N^k v_j$, as $j$ varies from 1 to $n$ and $k$ varies (in reverse order) from
0 to the largest nonnegative integer $m_j$ such that $N^{m_j} v_j \ne 0$. For the matrix
interpretation of the next result, see the first part of the proof of 8.60.


8.55 Basis corresponding to a nilpotent operator 


Suppose $N \in \mathcal{L}(V)$ is nilpotent. Then there exist vectors $v_1, \dots, v_n \in V$
and nonnegative integers $m_1, \dots, m_n$ such that

(a) $N^{m_1} v_1, \dots, N v_1, v_1, \dots, N^{m_n} v_n, \dots, N v_n, v_n$ is a basis of $V$;
(b) $N^{m_1+1} v_1 = \cdots = N^{m_n+1} v_n = 0$.


Proof We will prove this result by induction on $\dim V$. To get started, note
that the desired result obviously holds if $\dim V = 1$ (in that case, the only
nilpotent operator is the 0 operator, so take $v_1$ to be any nonzero vector and
$m_1 = 0$). Now assume that $\dim V > 1$ and that the desired result holds on all
vector spaces of smaller dimension.

Because $N$ is nilpotent, $N$ is not injective. Thus $N$ is not surjective (by
3.69) and hence $\operatorname{range} N$ is a subspace of $V$ that has a smaller dimension
than $V$. Thus we can apply our induction hypothesis to the restriction operator
$N|_{\operatorname{range} N} \in \mathcal{L}(\operatorname{range} N)$. [We can ignore the trivial case $\operatorname{range} N = \{0\}$,
because in that case $N$ is the 0 operator and we can choose $v_1, \dots, v_n$ to be
any basis of $V$ and $m_1 = \cdots = m_n = 0$ to get the desired result.]

By our induction hypothesis applied to $N|_{\operatorname{range} N}$, there exist vectors
$v_1, \dots, v_n \in \operatorname{range} N$ and nonnegative integers $m_1, \dots, m_n$ such that


8.56 $$N^{m_1} v_1, \dots, N v_1, v_1, \dots, N^{m_n} v_n, \dots, N v_n, v_n$$

is a basis of $\operatorname{range} N$ and

$$N^{m_1+1} v_1 = \cdots = N^{m_n+1} v_n = 0.$$


Because each $v_j$ is in $\operatorname{range} N$, for each $j$ there exists $u_j \in V$ such
that $v_j = N u_j$. Thus $N^{k+1} u_j = N^k v_j$ for each $j$ and each nonnegative
integer $k$. We now claim that


8.57 $$N^{m_1+1} u_1, \dots, N u_1, u_1, \dots, N^{m_n+1} u_n, \dots, N u_n, u_n$$
is a linearly independent list of vectors in $V$. To verify this claim, suppose
that some linear combination of 8.57 equals 0. Applying $N$ to that linear
combination, we get a linear combination of 8.56 equal to 0. However, the
list 8.56 is linearly independent, and hence all the coefficients in our original
linear combination of 8.57 equal 0 except possibly the coefficients of the
vectors

$$N^{m_1+1} u_1, \dots, N^{m_n+1} u_n,$$

which equal the vectors

$$N^{m_1} v_1, \dots, N^{m_n} v_n.$$


Again using the linear independence of the list 8.56, we conclude that those
coefficients also equal 0, completing our proof that the list 8.57 is linearly
independent.

Now extend 8.57 to a basis


8.58 $$N^{m_1+1} u_1, \dots, N u_1, u_1, \dots, N^{m_n+1} u_n, \dots, N u_n, u_n, w_1, \dots, w_p$$


of $V$ (which is possible by 2.33). Each $N w_j$ is in $\operatorname{range} N$ and hence is in the
span of 8.56. Each vector in the list 8.56 equals $N$ applied to some vector in
the list 8.57. Thus there exists $x_j$ in the span of 8.57 such that $N w_j = N x_j$.
Now let

$$u_{n+j} = w_j - x_j.$$


Then $N u_{n+j} = 0$. Furthermore,

$$N^{m_1+1} u_1, \dots, N u_1, u_1, \dots, N^{m_n+1} u_n, \dots, N u_n, u_n, u_{n+1}, \dots, u_{n+p}$$


spans $V$ because its span contains each $x_j$ and each $u_{n+j}$ and hence each $w_j$
(and because 8.58 spans $V$).

Thus the spanning list above is a basis of $V$ because it has the same length
as the basis 8.58 (where we have used 2.42). This basis has the required form,
completing the proof. ∎


[French mathematician Camille Jordan (1838-1922) first published a
proof of 8.60 in 1870.]

In the next definition, the diagonal of each $A_j$ is filled with some eigenvalue
$\lambda_j$ of $T$, the line directly above the diagonal of $A_j$ is filled with 1's, and all
other entries in $A_j$ are 0 (to understand why each $\lambda_j$ is an eigenvalue of $T$,
see 5.32). The $\lambda_j$'s need not be distinct. Also, $A_j$ may be a 1-by-1 matrix
$(\lambda_j)$ containing just an eigenvalue of $T$.
8.59 Definition Jordan basis


Suppose $T \in \mathcal{L}(V)$. A basis of $V$ is called a Jordan basis for $T$ if with
respect to this basis $T$ has a block diagonal matrix

$$\begin{pmatrix}
A_1 & & 0 \\
 & \ddots & \\
0 & & A_p
\end{pmatrix},$$

where each $A_j$ is an upper-triangular matrix of the form

$$A_j = \begin{pmatrix}
\lambda_j & 1 & & 0 \\
 & \ddots & \ddots & \\
 & & \ddots & 1 \\
0 & & & \lambda_j
\end{pmatrix}.$$
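
To make the shape in Definition 8.59 concrete, here is a brief sketch that assembles a block diagonal matrix from blocks of the stated form. The function names `jordan_block` and `block_diagonal` are hypothetical, and matrices are assumed to be stored as lists of rows.

```python
def jordan_block(lam, size):
    """size-by-size block: lam on the diagonal, 1's directly above it, 0 elsewhere."""
    return [[lam if i == j else (1 if j == i + 1 else 0) for j in range(size)]
            for i in range(size)]


def block_diagonal(blocks):
    """Place the given square blocks along the diagonal of a larger zero matrix."""
    n = sum(len(b) for b in blocks)
    M = [[0] * n for _ in range(n)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, entry in enumerate(row):
                M[offset + i][offset + j] = entry
        offset += len(b)
    return M


# Blocks of sizes 3, 2, 1 with eigenvalue 0 give the 6-by-6 matrix of Example 8.54.
J = block_diagonal([jordan_block(0, 3), jordan_block(0, 2), jordan_block(0, 1)])
for row in J:
    print(row)
```

With eigenvalue 0 in every block, the result is the matrix of a nilpotent operator, as in Example 8.54; a nonzero `lam` gives a block such as `jordan_block(5, 2)`, which has 5 on the diagonal and a single 1 above it.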


8.60 Jordan Form 


Suppose $V$ is a complex vector space. If $T \in \mathcal{L}(V)$, then there is a basis
of $V$ that is a Jordan basis for $T$.


Proof First consider a nilpotent operator $N \in \mathcal{L}(V)$ and the vectors
$v_1, \dots, v_n \in V$ given by 8.55. For each $j$, note that $N$ sends the first vector
in the list $N^{m_j} v_j, \dots, N v_j, v_j$ to 0 and that $N$ sends each vector in this list
other than the first vector to the previous vector. In other words, 8.55 gives a
basis of $V$ with respect to which $N$ has a block diagonal matrix, where each
matrix on the diagonal has the form

$$\begin{pmatrix}
0 & 1 & & 0 \\
 & \ddots & \ddots & \\
 & & \ddots & 1 \\
0 & & & 0
\end{pmatrix}.$$


Thus the desired result holds for nilpotent operators.
Now suppose $T \in \mathcal{L}(V)$. Let $\lambda_1, \dots, \lambda_m$ be the distinct eigenvalues of $T$.
We have the generalized eigenspace decomposition

$$V = G(\lambda_1, T) \oplus \cdots \oplus G(\lambda_m, T),$$


where each $(T - \lambda_j I)|_{G(\lambda_j, T)}$ is nilpotent (see 8.21). Thus some basis of each
$G(\lambda_j, T)$ is a Jordan basis for $(T - \lambda_j I)|_{G(\lambda_j, T)}$ (see previous paragraph).
Put these bases together to get a basis of $V$ that is a Jordan basis for $T$. ∎
EXERCISES 8.D


1 Find the characteristic polynomial and the minimal polynomial of the
operator $N$ in Example 8.53.


2 Find the characteristic polynomial and the minimal polynomial of the
operator $N$ in Example 8.54.


3 Suppose $N \in \mathcal{L}(V)$ is nilpotent. Prove that the minimal polynomial of
$N$ is $z^{m+1}$, where $m$ is the length of the longest consecutive string of
1's that appears on the line directly above the diagonal in the matrix of
$N$ with respect to any Jordan basis for $N$.


4 Suppose $T \in \mathcal{L}(V)$ and $v_1, \dots, v_n$ is a basis of $V$ that is a Jordan basis
for $T$. Describe the matrix of $T$ with respect to the basis $v_n, \dots, v_1$
obtained by reversing the order of the $v$'s.


5 Suppose $T \in \mathcal{L}(V)$ and $v_1, \dots, v_n$ is a basis of $V$ that is a Jordan basis
for $T$. Describe the matrix of $T^2$ with respect to this basis.


6 Suppose $N \in \mathcal{L}(V)$ is nilpotent and $v_1, \dots, v_n$ and $m_1, \dots, m_n$ are as
in 8.55. Prove that $N^{m_1} v_1, \dots, N^{m_n} v_n$ is a basis of $\operatorname{null} N$.

[The exercise above implies that $n$, which equals $\dim \operatorname{null} N$, depends
only on $N$ and not on the specific Jordan basis chosen for $N$.]


7 Suppose $p, q \in \mathcal{P}(\mathbf{C})$ are monic polynomials with the same zeros and $q$
is a polynomial multiple of $p$. Prove that there exists $T \in \mathcal{L}(\mathbf{C}^{\deg q})$ such
that the characteristic polynomial of $T$ is $q$ and the minimal polynomial
of $T$ is $p$.


8 Suppose $V$ is a complex vector space and $T \in \mathcal{L}(V)$. Prove that there
does not exist a direct sum decomposition of $V$ into two proper subspaces
invariant under $T$ if and only if the minimal polynomial of $T$ is of the
form $(z - \lambda)^{\dim V}$ for some $\lambda \in \mathbf{C}$.
Operators on Real Vector Spaces


In the last chapter we learned about the structure of an operator on a finite- 
dimensional complex vector space. In this chapter, we will use our results 
about operators on complex vector spaces to learn about operators on real 
vector spaces. 

Our assumptions for this chapter are as follows: 


9.1 Notation F, V 
• $\mathbf{F}$ denotes $\mathbf{R}$ or $\mathbf{C}$.

• $V$ denotes a finite-dimensional nonzero vector space over $\mathbf{F}$.


LEARNING OBJECTIVES FOR THIS CHAPTER 
■ complexification of a real vector space
■ complexification of an operator on a real vector space
■ operators on finite-dimensional real vector spaces have an
eigenvalue or a 2-dimensional invariant subspace
■ characteristic polynomial and the Cayley-Hamilton Theorem
■ description of normal operators on a real inner product space
■ description of isometries on a real inner product space
9.A Complexification


Complexification of a Vector Space 


As we will soon see, a real vector space V can be embedded, in a natural way, 
in a complex vector space called the complexification of V. Each operator 
on V can be extended to an operator on the complexification of V. Our 
results about operators on complex vector spaces can then be translated to 
information about operators on real vector spaces. 

We begin by defining the complexification of a real vector space. 


9.2 Definition complexification of $V$, $V_{\mathbf{C}}$


Suppose $V$ is a real vector space.


• The complexification of $V$, denoted $V_{\mathbf{C}}$, equals $V \times V$. An element
of $V_{\mathbf{C}}$ is an ordered pair $(u, v)$, where $u, v \in V$, but we will write
this as $u + iv$.


• Addition on $V_{\mathbf{C}}$ is defined by

$$(u_1 + iv_1) + (u_2 + iv_2) = (u_1 + u_2) + i(v_1 + v_2)$$

for $u_1, v_1, u_2, v_2 \in V$.


• Complex scalar multiplication on $V_{\mathbf{C}}$ is defined by

$$(a + bi)(u + iv) = (au - bv) + i(av + bu)$$

for $a, b \in \mathbf{R}$ and $u, v \in V$.


Motivation for the definition above of complex scalar multiplication comes
from usual algebraic properties and the identity $i^2 = -1$. If you remember
the motivation, then you do not need to memorize the definition above.

We think of $V$ as a subset of $V_{\mathbf{C}}$ by identifying $u \in V$ with $u + i0$.
The construction of $V_{\mathbf{C}}$ from $V$ can then be thought of as generalizing the
construction of $\mathbf{C}^n$ from $\mathbf{R}^n$.
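
The definition of complexification translates almost word for word into code. The sketch below is only an illustration under stated assumptions: it takes $V = \mathbf{R}^n$ with vectors stored as tuples, represents $u + iv$ as the ordered pair `(u, v)`, and uses the made-up function names `add` and `scalar_mul`. It also compares the result with Python's built-in complex arithmetic on $\mathbf{C}^n$, reflecting the remark that the construction generalizes the passage from $\mathbf{R}^n$ to $\mathbf{C}^n$.

```python
def add(x, y):
    """(u1 + i v1) + (u2 + i v2) = (u1 + u2) + i (v1 + v2)."""
    (u1, v1), (u2, v2) = x, y
    return (tuple(a + b for a, b in zip(u1, u2)),
            tuple(a + b for a, b in zip(v1, v2)))


def scalar_mul(c, x):
    """(a + bi)(u + iv) = (au - bv) + i (av + bu), with c = a + bi."""
    a, b = c.real, c.imag
    u, v = x
    return (tuple(a * ui - b * vi for ui, vi in zip(u, v)),
            tuple(a * vi + b * ui for ui, vi in zip(u, v)))


def to_complex_tuple(x):
    """Identify the pair (u, v) with the vector u + iv in C^n."""
    u, v = x
    return tuple(complex(ui, vi) for ui, vi in zip(u, v))


x = ((1.0, 2.0), (3.0, -1.0))            # the element (1, 2) + i(3, -1)
c = complex(2, 3)

print(scalar_mul(c, x))
print(tuple(c * z for z in to_complex_tuple(x)))   # same numbers, built-in arithmetic
print(scalar_mul(1j, scalar_mul(1j, x)))           # multiplying by i twice negates x
```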


9.3 $V_{\mathbf{C}}$ is a complex vector space.

Suppose $V$ is a real vector space. Then with the definitions of addition
and scalar multiplication as above, $V_{\mathbf{C}}$ is a complex vector space.


The proof of the result above is left as an exercise for the reader. Note that
the additive identity of $V_{\mathbf{C}}$ is $0 + i0$, which we write as just 0.
Probably everything that you think should work concerning complexifica-
tion does work, usually with a straightforward verification, as illustrated by 
the next result. 


9.4 Basis of $V$ is basis of $V_{\mathbf{C}}$


Suppose $V$ is a real vector space.


(a) If $v_1, \dots, v_n$ is a basis of $V$ (as a real vector space), then $v_1, \dots, v_n$
is a basis of $V_{\mathbf{C}}$ (as a complex vector space).


(b) The dimension of $V_{\mathbf{C}}$ (as a complex vector space) equals the dimension
of $V$ (as a real vector space).


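Proof To prove (a), suppose $v_1, \dots, v_n$ is a basis of the real vector space $V$.
Then $\operatorname{span}(v_1, \dots, v_n)$ in the complex vector space $V_{\mathbf{C}}$ contains all the vectors
$v_1, \dots, v_n, iv_1, \dots, iv_n$. Thus $v_1, \dots, v_n$ spans the complex vector space $V_{\mathbf{C}}$.

To show that $v_1, \dots, v_n$ is linearly independent in the complex vector
space $V_{\mathbf{C}}$, suppose $\lambda_1, \dots, \lambda_n \in \mathbf{C}$ and

$$\lambda_1 v_1 + \cdots + \lambda_n v_n = 0.$$

Then the equation above and our definitions imply that

$$(\operatorname{Re} \lambda_1) v_1 + \cdots + (\operatorname{Re} \lambda_n) v_n = 0 \quad \text{and} \quad (\operatorname{Im} \lambda_1) v_1 + \cdots + (\operatorname{Im} \lambda_n) v_n = 0.$$


Because $v_1, \dots, v_n$ is linearly independent in $V$, the equations above imply
$\operatorname{Re} \lambda_1 = \cdots = \operatorname{Re} \lambda_n = 0$ and $\operatorname{Im} \lambda_1 = \cdots = \operatorname{Im} \lambda_n = 0$. Thus we have
$\lambda_1 = \cdots = \lambda_n = 0$. Hence $v_1, \dots, v_n$ is linearly independent in $V_{\mathbf{C}}$,
completing the proof of (a).

Clearly (b) follows immediately from (a). ∎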


Complexification of an Operator 


Now we can define the complexification of an operator. 


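9.5 Definition complexification of $T$, $T_{\mathbf{C}}$


Suppose $V$ is a real vector space and $T \in \mathcal{L}(V)$. The complexification of
$T$, denoted $T_{\mathbf{C}}$, is the operator $T_{\mathbf{C}} \in \mathcal{L}(V_{\mathbf{C}})$ defined by

$$T_{\mathbf{C}}(u + iv) = Tu + iTv$$

for $u, v \in V$.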
You should verify that if $V$ is a real vector space and $T \in \mathcal{L}(V)$, then $T_{\mathbf{C}}$
is indeed in $\mathcal{L}(V_{\mathbf{C}})$. The key point here is that our definition of complex scalar
multiplication can be used to show that $T_{\mathbf{C}}(\lambda(u + iv)) = \lambda T_{\mathbf{C}}(u + iv)$ for
all $u, v \in V$ and all complex numbers $\lambda$.

The next example gives a good way to think about the complexification of 
a typical operator. 


9.6 Example Suppose $A$ is an $n$-by-$n$ matrix of real numbers. Define
$T \in \mathcal{L}(\mathbf{R}^n)$ by $Tx = Ax$, where elements of $\mathbf{R}^n$ are thought of as $n$-by-1
column vectors. Identifying the complexification of $\mathbf{R}^n$ with $\mathbf{C}^n$, we then
have $T_{\mathbf{C}} z = Az$ for each $z \in \mathbf{C}^n$, where again elements of $\mathbf{C}^n$ are thought of
as $n$-by-1 column vectors.

In other words, if $T$ is the operator of matrix multiplication by $A$ on $\mathbf{R}^n$,
then the complexification $T_{\mathbf{C}}$ is also matrix multiplication by $A$ but now acting
on the larger domain $\mathbf{C}^n$.
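
Example 9.6 can be checked on a small case. In the sketch below the 2-by-2 matrix `A` and the vectors `u`, `v` are arbitrary illustrative choices, and the function names are made up. The code computes $T_{\mathbf{C}}(u + iv)$ in two ways: from the definition as $Tu + iTv$, and as the matrix product $A(u + iv)$ carried out with complex entries.

```python
def apply(A, x):
    """Matrix-vector product; works for real or complex entries of x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def complexification(A, u, v):
    """T_C(u + iv) computed from the definition: Tu + i Tv."""
    Tu, Tv = apply(A, u), apply(A, v)
    return [complex(s, t) for s, t in zip(Tu, Tv)]


A = [[1, 2], [3, 4]]      # arbitrary real matrix, for illustration only
u = [1, -2]
v = [0, 5]

z = [complex(ui, vi) for ui, vi in zip(u, v)]   # u + iv as an element of C^2

print(complexification(A, u, v))
print(apply(A, z))
```

The two printed vectors agree, as the example asserts.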


The next result makes sense because 9.4 tells us that a basis of a real vector 
space is also a basis of its complexification. The proof of the next result 
follows immediately from the definitions. 


9.7 Matrix of Tc equals matrix of T 


Suppose $V$ is a real vector space with basis $v_1, \dots, v_n$ and $T \in \mathcal{L}(V)$.
Then $\mathcal{M}(T) = \mathcal{M}(T_{\mathbf{C}})$, where both matrices are with respect to the basis
$v_1, \dots, v_n$.


The result above and Example 9.6 provide complete insight into complexi- 
fication, because once a basis is chosen, every operator essentially looks like 
Example 9.6. Complexification of an operator could have been defined using 
matrices, but the approach taken here is more natural because it does not 
depend on the choice of a basis. 

We know that every operator on a nonzero finite-dimensional complex 
vector space has an eigenvalue (see 5.21) and thus has a 1-dimensional in- 
variant subspace. We have seen an example [5.8(a)] of an operator on a 
nonzero finite-dimensional real vector space with no eigenvalues and thus no 
1-dimensional invariant subspaces. However, we now show that an invariant 
subspace of dimension 1 or 2 always exists. Notice how complexification 
leads to a simple proof of this result. 
9.8 Every operator has an invariant subspace of dimension 1 or 2


Every operator on a nonzero finite-dimensional vector space has an 
invariant subspace of dimension 1 or 2. 


Proof Every operator on a nonzero finite-dimensional complex vector space 
has an eigenvalue (5.21) and thus has a 1-dimensional invariant subspace. 

Hence assume $V$ is a real vector space and $T \in \mathcal{L}(V)$. The complexification
$T_{\mathbf{C}}$ has an eigenvalue $a + bi$ (by 5.21), where $a, b \in \mathbf{R}$. Thus there exist
$u, v \in V$, not both 0, such that $T_{\mathbf{C}}(u + iv) = (a + bi)(u + iv)$. Using the
definition of $T_{\mathbf{C}}$, the last equation can be rewritten as

$$Tu + iTv = (au - bv) + (av + bu)i.$$


Thus

$$Tu = au - bv \quad \text{and} \quad Tv = av + bu.$$


Let $U$ equal the span in $V$ of the list $u, v$. Then $U$ is a subspace of $V$
with dimension 1 or 2. The equations above show that $U$ is invariant under $T$,
completing the proof. ∎
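
The two equations at the end of the proof can be verified on a specific operator. The sketch below uses the operator $T(x_1, x_2, x_3) = (2x_1, x_2 - x_3, x_2 + x_3)$ on $\mathbf{R}^3$ as an illustrative choice. One checks by hand that $1 + i$ is an eigenvalue of its complexification with eigenvector $(0, 1, -i)$, so that $u = (0, 1, 0)$, $v = (0, 0, -1)$, $a = 1$, $b = 1$. The helper names are hypothetical.

```python
def T(x):
    x1, x2, x3 = x
    return (2 * x1, x2 - x3, x2 + x3)


def combo(a, u, b, v):
    """The linear combination a*u + b*v of two tuples."""
    return tuple(a * ui + b * vi for ui, vi in zip(u, v))


def spans_invariant_subspace(T, u, v, a, b):
    """Check Tu = au - bv and Tv = av + bu, which make span(u, v) invariant under T."""
    return T(u) == combo(a, u, -b, v) and T(v) == combo(a, v, b, u)


u = (0, 1, 0)
v = (0, 0, -1)

print(T(u), T(v))
print(spans_invariant_subspace(T, u, v, 1, 1))
```

Here the invariant subspace spanned by $u$ and $v$ is the 2-dimensional subspace of vectors whose first coordinate is 0.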


The Minimal Polynomial of the Complexification 


Suppose $V$ is a real vector space and $T \in \mathcal{L}(V)$. Repeated application of the
definition of $T_{\mathbf{C}}$ shows that


9.9 $$(T_{\mathbf{C}})^n (u + iv) = T^n u + i T^n v$$


for every positive integer $n$ and all $u, v \in V$.
Notice that the next result implies that the minimal polynomial of Tc has 
real coefficients. 


9.10 Minimal polynomial of Tc equals minimal polynomial of T 


Suppose $V$ is a real vector space and $T \in \mathcal{L}(V)$. Then the minimal
polynomial of $T_{\mathbf{C}}$ equals the minimal polynomial of $T$.


Proof Let $p \in \mathcal{P}(\mathbf{R})$ denote the minimal polynomial of $T$. From 9.9 it is
easy to see that $p(T_{\mathbf{C}}) = (p(T))_{\mathbf{C}}$ and thus $p(T_{\mathbf{C}}) = 0$.

Suppose $q \in \mathcal{P}(\mathbf{C})$ is a monic polynomial such that $q(T_{\mathbf{C}}) = 0$. Then
$(q(T_{\mathbf{C}}))(u) = 0$ for every $u \in V$. Letting $r$ denote the polynomial whose $j$th
coefficient is the real part of the $j$th coefficient of $q$, we see that $r$ is a monic
polynomial and $r(T) = 0$. Thus $\deg q = \deg r \ge \deg p$.

The conclusions of the two previous paragraphs imply that $p$ is the minimal
polynomial of $T_{\mathbf{C}}$, as desired. ∎
Eigenvalues of the Complexification


Now we turn to questions about the eigenvalues of the complexification of an 
operator. Again, everything that we expect to work indeed works easily. 

We begin with a result showing that the real eigenvalues of $T_{\mathbf{C}}$ are precisely
the eigenvalues of $T$. We give two different proofs of this result. The first
proof is more elementary, but the second proof is shorter and gives some
useful insight.


9.11 Real eigenvalues of $T_{\mathbf{C}}$


Suppose $V$ is a real vector space, $T \in \mathcal{L}(V)$, and $\lambda \in \mathbf{R}$. Then $\lambda$ is an
eigenvalue of $T_{\mathbf{C}}$ if and only if $\lambda$ is an eigenvalue of $T$.


Proof 1 First suppose $\lambda$ is an eigenvalue of $T$. Then there exists $v \in V$
with $v \ne 0$ such that $Tv = \lambda v$. Thus $T_{\mathbf{C}} v = \lambda v$, which shows that $\lambda$ is an
eigenvalue of $T_{\mathbf{C}}$, completing one direction of the proof.

To prove the other direction, suppose now that $\lambda$ is an eigenvalue of $T_{\mathbf{C}}$.
Then there exist $u, v \in V$ with $u + iv \ne 0$ such that

$$T_{\mathbf{C}}(u + iv) = \lambda(u + iv).$$


The equation above implies that $Tu = \lambda u$ and $Tv = \lambda v$. Because $u \ne 0$ or
$v \ne 0$, this implies that $\lambda$ is an eigenvalue of $T$, completing the proof. ∎


Proof 2 The (real) eigenvalues of $T$ are the (real) zeros of the minimal
polynomial of $T$ (by 8.49). The real eigenvalues of $T_{\mathbf{C}}$ are the real zeros of the
minimal polynomial of $T_{\mathbf{C}}$ (again by 8.49). These two minimal polynomials
are the same (by 9.10). Thus the eigenvalues of $T$ are precisely the real
eigenvalues of $T_{\mathbf{C}}$, as desired. ∎


Our next result shows that $T_{\mathbf{C}}$ behaves symmetrically with respect to an
eigenvalue $\lambda$ and its complex conjugate $\bar{\lambda}$.


9.12 $T_{\mathbf{C}} - \lambda I$ and $T_{\mathbf{C}} - \bar{\lambda} I$


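Suppose $V$ is a real vector space, $T \in \mathcal{L}(V)$, $\lambda \in \mathbf{C}$, $j$ is a nonnegative
integer, and $u, v \in V$. Then

$$(T_{\mathbf{C}} - \lambda I)^j (u + iv) = 0 \quad \text{if and only if} \quad (T_{\mathbf{C}} - \bar{\lambda} I)^j (u - iv) = 0.$$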
Proof We will prove this result by induction on $j$. To get started, note that
if $j = 0$ then (because an operator raised to the power 0 equals the identity
operator) the result claims that $u + iv = 0$ if and only if $u - iv = 0$, which is
clearly true.

Thus assume by induction that $j \ge 1$ and the desired result holds for $j - 1$.
Suppose $(T_{\mathbf{C}} - \lambda I)^j (u + iv) = 0$. Then


9.13 $$(T_{\mathbf{C}} - \lambda I)^{j-1} \big( (T_{\mathbf{C}} - \lambda I)(u + iv) \big) = 0.$$

Writing $\lambda = a + bi$, where $a, b \in \mathbf{R}$, we have

9.14 $$(T_{\mathbf{C}} - \lambda I)(u + iv) = (Tu - au + bv) + i(Tv - av - bu)$$

and

9.15 $$(T_{\mathbf{C}} - \bar{\lambda} I)(u - iv) = (Tu - au + bv) - i(Tv - av - bu).$$

Our induction hypothesis, 9.13, and 9.14 imply that

$$(T_{\mathbf{C}} - \bar{\lambda} I)^{j-1} \big( (Tu - au + bv) - i(Tv - av - bu) \big) = 0.$$


Now the equation above and 9.15 imply that $(T_{\mathbf{C}} - \bar{\lambda} I)^j (u - iv) = 0$,
completing the proof in one direction.

The other direction is proved by replacing $\lambda$ with $\bar{\lambda}$, replacing $v$ with $-v$,
and then using the first direction. ∎


An important consequence of the result above is the next result, which 
states that if a number is an eigenvalue of Tc, then its complex conjugate is 
also an eigenvalue of Tc. 


9.16 Nonreal eigenvalues of Tc come in pairs 


Suppose $V$ is a real vector space, $T \in \mathcal{L}(V)$, and $\lambda \in \mathbf{C}$. Then $\lambda$ is an
eigenvalue of $T_{\mathbf{C}}$ if and only if $\bar{\lambda}$ is an eigenvalue of $T_{\mathbf{C}}$.


Proof Take $j = 1$ in 9.12. ∎


By definition, the eigenvalues of an operator on a real vector space are 
real numbers. Thus when mathematicians sometimes informally mention the 
complex eigenvalues of an operator on a real vector space, what they have in 
mind is the eigenvalues of the complexification of the operator. 

Recall that the multiplicity of an eigenvalue is defined to be the dimension 
of the generalized eigenspace corresponding to that eigenvalue (see 8.24). The 
next result states that the multiplicity of an eigenvalue of a complexification 
equals the multiplicity of its complex conjugate. 
9.17 Multiplicity of $\lambda$ equals multiplicity of $\bar{\lambda}$


Suppose $V$ is a real vector space, $T \in \mathcal{L}(V)$, and $\lambda \in \mathbf{C}$ is an eigenvalue
of $T_{\mathbf{C}}$. Then the multiplicity of $\lambda$ as an eigenvalue of $T_{\mathbf{C}}$ equals the
multiplicity of $\bar{\lambda}$ as an eigenvalue of $T_{\mathbf{C}}$.


Proof Suppose $u_1 + iv_1, \dots, u_m + iv_m$ is a basis of the generalized
eigenspace $G(\lambda, T_{\mathbf{C}})$, where $u_1, \dots, u_m, v_1, \dots, v_m \in V$. Then using 9.12,
routine arguments show that $u_1 - iv_1, \dots, u_m - iv_m$ is a basis of the generalized
eigenspace $G(\bar{\lambda}, T_{\mathbf{C}})$. Thus both $\lambda$ and $\bar{\lambda}$ have multiplicity $m$ as
eigenvalues of $T_{\mathbf{C}}$. ∎


9.18 Example Suppose $T \in \mathcal{L}(\mathbf{R}^3)$ is defined by

$$T(x_1, x_2, x_3) = (2x_1, x_2 - x_3, x_2 + x_3).$$

The matrix of $T$ with respect to the standard basis of $\mathbf{R}^3$ is

$$\begin{pmatrix}
2 & 0 & 0 \\
0 & 1 & -1 \\
0 & 1 & 1
\end{pmatrix}.$$

As you can verify, 2 is an eigenvalue of $T$ with multiplicity 1 and $T$ has no
other eigenvalues.

If we identify the complexification of $\mathbf{R}^3$ with $\mathbf{C}^3$, then the matrix of $T_{\mathbf{C}}$
with respect to the standard basis of $\mathbf{C}^3$ is the matrix above. As you can
verify, the eigenvalues of $T_{\mathbf{C}}$ are 2, $1 + i$, and $1 - i$, each with multiplicity
1. Thus the nonreal eigenvalues of $T_{\mathbf{C}}$ come as a pair, with each the complex
conjugate of the other and with the same multiplicity, as expected by 9.17.
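
The eigenvalue claims in Example 9.18 can be confirmed by exhibiting eigenvectors. In the sketch below the eigenvectors were found by hand and are supplied as data; the helper names are made up. The check also illustrates 9.12 with $j = 1$: conjugating every entry of an eigenvector for $1 + i$ gives an eigenvector for $1 - i$.

```python
A = [[2, 0, 0],
     [0, 1, -1],
     [0, 1, 1]]            # matrix of T (and of T_C) from Example 9.18


def apply(A, z):
    """Matrix-vector product with possibly complex entries."""
    return [sum(a * zi for a, zi in zip(row, z)) for row in A]


def is_eigenpair(A, lam, z):
    """True if z is nonzero and A z = lam z."""
    return any(zi != 0 for zi in z) and apply(A, z) == [lam * zi for zi in z]


z = [0, 1, -1j]                          # eigenvector for 1 + i
z_bar = [complex(zi).conjugate() for zi in z]

print(is_eigenpair(A, 2, [1, 0, 0]))
print(is_eigenpair(A, 1 + 1j, z))
print(is_eigenpair(A, 1 - 1j, z_bar))
```

All entries involved are small integers or Gaussian integers, so the floating-point comparisons here are exact.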


We have seen an example [5.8(a)] of an operator on $\mathbf{R}^2$ with no eigenvalues.
The next result shows that no such example exists on $\mathbf{R}^3$.


9.19 Operator on odd-dimensional vector space has eigenvalue 


Every operator on an odd-dimensional real vector space has an eigenvalue. 


Proof Suppose $V$ is a real vector space with odd dimension and $T \in \mathcal{L}(V)$.
Because the nonreal eigenvalues of $T_{\mathbf{C}}$ come in pairs with equal multiplicity
(by 9.17), the sum of the multiplicities of all the nonreal eigenvalues of $T_{\mathbf{C}}$ is
an even number.

Because the sum of the multiplicities of all the eigenvalues of $T_{\mathbf{C}}$ equals
the (complex) dimension of $V_{\mathbf{C}}$ (by Theorem 8.26), the conclusion of the
paragraph above implies that $T_{\mathbf{C}}$ has a real eigenvalue. Every real eigenvalue
of $T_{\mathbf{C}}$ is also an eigenvalue of $T$ (by 9.11), giving the desired result. ∎
Characteristic Polynomial of the Complexification


In the previous chapter we defined the characteristic polynomial of an operator 
on a finite-dimensional complex vector space (see 8.34). The next result is 
a key step toward defining the characteristic polynomial for operators on 
finite-dimensional real vector spaces. 


9.20 Characteristic polynomial of Tc 


Suppose $V$ is a real vector space and $T \in \mathcal{L}(V)$. Then the coefficients of
the characteristic polynomial of $T_{\mathbf{C}}$ are all real.


Proof Suppose $\lambda$ is a nonreal eigenvalue of $T_{\mathbf{C}}$ with multiplicity $m$. Then $\bar{\lambda}$ is
also an eigenvalue of $T_{\mathbf{C}}$ with multiplicity $m$ (by 9.17). Thus the characteristic
polynomial of $T_{\mathbf{C}}$ includes factors of $(z - \lambda)^m$ and $(z - \bar{\lambda})^m$. Multiplying
together these two factors, we have

$$(z - \lambda)^m (z - \bar{\lambda})^m = \big( z^2 - 2(\operatorname{Re} \lambda) z + |\lambda|^2 \big)^m.$$


The polynomial above on the right has real coefficients.

The characteristic polynomial of $T_{\mathbf{C}}$ is the product of terms of the form
above and terms of the form $(z - t)^d$, where $t$ is a real eigenvalue of $T_{\mathbf{C}}$ with
multiplicity $d$. Thus the coefficients of the characteristic polynomial of $T_{\mathbf{C}}$
are all real. ∎


Now we can define the characteristic polynomial of an operator on a 
finite-dimensional real vector space to be the characteristic polynomial of its 
complexification. 


9.21 Definition Characteristic polynomial 


Suppose $V$ is a real vector space and $T \in \mathcal{L}(V)$. Then the characteristic
polynomial of $T$ is defined to be the characteristic polynomial of $T_{\mathbf{C}}$.


9.22 Example Suppose $T \in \mathcal{L}(\mathbf{R}^3)$ is defined by

$$T(x_1, x_2, x_3) = (2x_1, x_2 - x_3, x_2 + x_3).$$


As we noted in 9.18, the eigenvalues of $T_{\mathbf{C}}$ are 2, $1 + i$, and $1 - i$, each with
multiplicity 1. Thus the characteristic polynomial of the complexification $T_{\mathbf{C}}$
is $(z - 2)\big(z - (1 + i)\big)\big(z - (1 - i)\big)$, which equals $z^3 - 4z^2 + 6z - 4$. Hence
the characteristic polynomial of $T$ is also $z^3 - 4z^2 + 6z - 4$.
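
The expansion in Example 9.22 is a routine multiplication of linear factors, sketched below with made-up helper names. Polynomials are assumed to be stored as coefficient lists with the constant term first. Although the factors have nonreal coefficients, the imaginary parts cancel in the product, as 9.20 predicts.

```python
def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists, constant term first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def from_roots(roots):
    """Monic polynomial (z - r1)(z - r2)... for the given roots."""
    poly = [1]
    for r in roots:
        poly = poly_mul(poly, [-r, 1])
    return poly


coeffs = from_roots([2, 1 + 1j, 1 - 1j])
print(coeffs)   # constant term first; every imaginary part is 0
```

Read from the constant term up, the coefficients are $-4, 6, -4, 1$, matching $z^3 - 4z^2 + 6z - 4$.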
In the next result, the eigenvalues of T are all real (because T is an operator
on a real vector space). 


9.23 Degree and zeros of characteristic polynomial 


Suppose $V$ is a real vector space and $T \in \mathcal{L}(V)$. Then


(a) the coefficients of the characteristic polynomial of T are all real; 
(b) the characteristic polynomial of T has degree dim V; 


(c) the eigenvalues of T are precisely the real zeros of the characteristic 
polynomial of T. 


Proof Part (a) holds because of 9.20. 

Part (b) follows from 8.36(a). 

Part (c) holds because the real zeros of the characteristic polynomial of $T$
are the real eigenvalues of $T_{\mathbf{C}}$ [by 8.36(a)], which are the eigenvalues of $T$
(by 9.11). ∎


In the previous chapter, we proved the Cayley-Hamilton Theorem (8.37) 
for complex vector spaces. Now we can also prove it for real vector spaces. 


9.24 Cayley-Hamilton Theorem


Suppose $T \in \mathcal{L}(V)$. Let $q$ denote the characteristic polynomial of $T$.
Then $q(T) = 0$.


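Proof We have already proved this result when $V$ is a complex vector space.
Thus assume that $V$ is a real vector space.

The complex case of the Cayley-Hamilton Theorem (8.37) implies that
$q(T_{\mathbf{C}}) = 0$. Because $q$ has real coefficients (by 9.20) and $T_{\mathbf{C}} v = Tv$ for every
$v \in V$, we have $q(T)v = q(T_{\mathbf{C}})v = 0$ for every $v \in V$. Thus we also have
$q(T) = 0$, as desired. ∎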


9.25 Example Suppose $T \in \mathcal{L}(\mathbf{R}^3)$ is defined by

\[ T(x_1, x_2, x_3) = (2x_1,\; x_2 - x_3,\; x_2 + x_3). \]

As we saw in 9.22, the characteristic polynomial of $T$ is $z^3 - 4z^2 + 6z - 4$.
Thus the Cayley-Hamilton Theorem implies that $T^3 - 4T^2 + 6T - 4I = 0$,
which can also be verified by direct calculation.
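
The direct calculation mentioned in Example 9.25 can be sketched in a few lines of Python. The matrix below is the matrix of $T$ with respect to the standard basis of $\mathbf{R}^3$, read off from the formula defining $T$; the helper `mat_mul` is our own name.

```python
def mat_mul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Matrix of T(x1, x2, x3) = (2*x1, x2 - x3, x2 + x3) in the standard basis.
T = [[2, 0, 0],
     [0, 1, -1],
     [0, 1, 1]]
I = [[1, 0, 0],
     [0, 1, 0],
     [0, 0, 1]]

T2 = mat_mul(T, T)
T3 = mat_mul(T2, T)

# Evaluate q(T) = T^3 - 4 T^2 + 6 T - 4 I entry by entry.
q_of_T = [[T3[i][j] - 4 * T2[i][j] + 6 * T[i][j] - 4 * I[i][j]
           for j in range(3)] for i in range(3)]
print(q_of_T)
```

All arithmetic here is in integers, so the zero matrix is obtained exactly rather than up to rounding.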


We can now prove another result that we previously knew only in the 
complex case. 

9.26 Characteristic polynomial is a multiple of minimal polynomial


Suppose $T \in \mathcal{L}(V)$. Then

(a) the degree of the minimal polynomial of $T$ is at most $\dim V$;

(b) the characteristic polynomial of $T$ is a polynomial multiple of the
minimal polynomial of $T$.


Proof Part (a) follows immediately from the Cayley-Hamilton Theorem.

Part (b) follows from the Cayley-Hamilton Theorem and 8.46. ∎


EXERCISES 9.A 


1 Prove 9.3.

2 Verify that if $V$ is a real vector space and $T \in \mathcal{L}(V)$, then $T_{\mathbf{C}} \in \mathcal{L}(V_{\mathbf{C}})$.

3 Suppose $V$ is a real vector space and $v_1, \dots, v_m \in V$. Prove that
$v_1, \dots, v_m$ is linearly independent in $V_{\mathbf{C}}$ if and only if $v_1, \dots, v_m$ is
linearly independent in $V$.

4 Suppose $V$ is a real vector space and $v_1, \dots, v_m \in V$. Prove that
$v_1, \dots, v_m$ spans $V_{\mathbf{C}}$ if and only if $v_1, \dots, v_m$ spans $V$.

5 Suppose that $V$ is a real vector space and $S, T \in \mathcal{L}(V)$. Show that
$(S + T)_{\mathbf{C}} = S_{\mathbf{C}} + T_{\mathbf{C}}$ and that $(\lambda T)_{\mathbf{C}} = \lambda T_{\mathbf{C}}$ for every $\lambda \in \mathbf{R}$.

6 Suppose $V$ is a real vector space and $T \in \mathcal{L}(V)$. Prove that $T_{\mathbf{C}}$ is
invertible if and only if $T$ is invertible.

7 Suppose $V$ is a real vector space and $N \in \mathcal{L}(V)$. Prove that $N_{\mathbf{C}}$ is
nilpotent if and only if $N$ is nilpotent.

8 Suppose $T \in \mathcal{L}(\mathbf{R}^3)$ and $5, 7$ are eigenvalues of $T$. Prove that $T_{\mathbf{C}}$ has no
nonreal eigenvalues.

9 Prove there does not exist an operator $T \in \mathcal{L}(\mathbf{R}^7)$ such that $T^2 + T + I$
is nilpotent.

10 Give an example of an operator $T \in \mathcal{L}(\mathbf{C}^7)$ such that $T^2 + T + I$
is nilpotent.

11 Suppose $V$ is a real vector space and $T \in \mathcal{L}(V)$. Suppose there exist
$b, c \in \mathbf{R}$ such that $T^2 + bT + cI = 0$. Prove that $T$ has an eigenvalue
if and only if $b^2 \geq 4c$.

12 Suppose $V$ is a real vector space and $T \in \mathcal{L}(V)$. Suppose there exist
$b, c \in \mathbf{R}$ such that $b^2 < 4c$ and $T^2 + bT + cI$ is nilpotent. Prove that
$T$ has no eigenvalues.

13 Suppose $V$ is a real vector space, $T \in \mathcal{L}(V)$, and $b, c \in \mathbf{R}$ are such that
$b^2 < 4c$. Prove that $\operatorname{null}\,(T^2 + bT + cI)^j$ has even dimension for every
positive integer $j$.

14 Suppose $V$ is a real vector space with $\dim V = 8$. Suppose $T \in \mathcal{L}(V)$
is such that $T^2 + T + I$ is nilpotent. Prove that $(T^2 + T + I)^4 = 0$.

15 Suppose $V$ is a real vector space and $T \in \mathcal{L}(V)$ has no eigenvalues.
Prove that every subspace of $V$ invariant under $T$ has even dimension.

16 Suppose $V$ is a real vector space. Prove that there exists $T \in \mathcal{L}(V)$ such
that $T^2 = -I$ if and only if $V$ has even dimension.

17 Suppose $V$ is a real vector space and $T \in \mathcal{L}(V)$ satisfies $T^2 = -I$.
Define complex scalar multiplication on $V$ as follows: if $a, b \in \mathbf{R}$, then

\[ (a + bi)v = av + bTv. \]

(a) Show that the complex scalar multiplication on $V$ defined above
and the addition on $V$ makes $V$ into a complex vector space.

(b) Show that the dimension of $V$ as a complex vector space is half
the dimension of $V$ as a real vector space.

18 Suppose $V$ is a real vector space and $T \in \mathcal{L}(V)$. Prove that the following
are equivalent:

(a) All the eigenvalues of $T_{\mathbf{C}}$ are real.

(b) There exists a basis of $V$ with respect to which $T$ has an upper-triangular
matrix.

(c) There exists a basis of $V$ consisting of generalized eigenvectors
of $T$.

19 Suppose $V$ is a real vector space with $\dim V = n$ and $T \in \mathcal{L}(V)$ is
such that $\operatorname{null} T^{n-2} \neq \operatorname{null} T^{n-1}$. Prove that $T$ has at most two distinct
eigenvalues and that $T_{\mathbf{C}}$ has no nonreal eigenvalues.




9.B Operators on Real Inner Product Spaces


We now switch our focus to the context of inner product spaces. We will give 
a complete description of normal operators on real inner product spaces; a 
key step in the proof of this result (9.34) requires the result from the previous 
section that an operator on a finite-dimensional real vector space has an 
invariant subspace of dimension 1 or 2 (9.8). 

After describing the normal operators on real inner product spaces, we will 
use that result to give a complete description of isometries on such spaces. 


Normal Operators on Real Inner Product Spaces 


The Complex Spectral Theorem (7.24) gives a complete description of normal 
operators on complex inner product spaces. In this subsection we will give a 
complete description of normal operators on real inner product spaces. 

We begin with a description of the operators on 2-dimensional real inner 
product spaces that are normal but not self-adjoint. 


9.27 Normal but not self-adjoint operators 


Suppose $V$ is a 2-dimensional real inner product space and $T \in \mathcal{L}(V)$.
Then the following are equivalent:

(a) $T$ is normal but not self-adjoint.

(b) The matrix of $T$ with respect to every orthonormal basis of $V$ has
the form

\[ \begin{pmatrix} a & -b \\ b & a \end{pmatrix}, \]

with $b \neq 0$.

(c) The matrix of $T$ with respect to some orthonormal basis of $V$ has
the form

\[ \begin{pmatrix} a & -b \\ b & a \end{pmatrix}, \]

with $b > 0$.


Proof First suppose (a) holds, so that $T$ is normal but not self-adjoint. Let
$e_1, e_2$ be an orthonormal basis of $V$. Suppose

9.28 \[ \mathcal{M}\bigl(T, (e_1, e_2)\bigr) = \begin{pmatrix} a & c \\ b & d \end{pmatrix}. \]

Then $\|Te_1\|^2 = a^2 + b^2$ and $\|T^*e_1\|^2 = a^2 + c^2$. Because $T$ is normal,
$\|Te_1\| = \|T^*e_1\|$ (see 7.20); thus these equations imply that $b^2 = c^2$. Thus
$c = b$ or $c = -b$. But $c \neq b$, because otherwise $T$ would be self-adjoint, as
can be seen from the matrix in 9.28. Hence $c = -b$, so

9.29 \[ \mathcal{M}\bigl(T, (e_1, e_2)\bigr) = \begin{pmatrix} a & -b \\ b & d \end{pmatrix}. \]

The matrix of $T^*$ is the transpose of the matrix above. Use matrix multiplication
to compute the matrices of $TT^*$ and $T^*T$ (do it now). Because $T$ is
normal, these two matrices are equal. Equating the entries in the upper-right
corner of the two matrices you computed, you will discover that $bd = ab$.
Now $b \neq 0$, because otherwise $T$ would be self-adjoint, as can be seen from
the matrix in 9.29. Thus $d = a$, completing the proof that (a) implies (b).

Now suppose (b) holds. We want to prove that (c) holds. Choose an
orthonormal basis $e_1, e_2$ of $V$. We know that the matrix of $T$ with respect to
this basis has the form given by (b), with $b \neq 0$. If $b > 0$, then (c) holds
and we have proved that (b) implies (c). If $b < 0$, then, as you should verify,
the matrix of $T$ with respect to the orthonormal basis $e_1, -e_2$ equals
$\begin{pmatrix} a & b \\ -b & a \end{pmatrix}$,
where $-b > 0$; thus in this case we also see that (b) implies (c).

Now suppose (c) holds, so that the matrix of $T$ with respect to some
orthonormal basis has the form given in (c) with $b > 0$. Clearly the matrix
of $T$ is not equal to its transpose (because $b \neq 0$). Hence $T$ is not self-adjoint.
Now use matrix multiplication to verify that the matrices of $TT^*$ and $T^*T$
are equal. We conclude that $TT^* = T^*T$. Hence $T$ is normal. Thus (c)
implies (a), completing the proof. ∎
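
The matrix computations that the proof of 9.27 asks the reader to carry out can be spot-checked numerically. The sketch below is only an illustration for particular integer values of $a$, $b$, $d$ chosen by us; the helper names are hypothetical. Because the basis is orthonormal and the scalars are real, the matrix of the adjoint is the transpose.

```python
def mat_mul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(M):
    return [list(row) for row in zip(*M)]

def is_normal(M):
    """M commutes with its transpose (the adjoint, for a real orthonormal basis)."""
    Mt = transpose(M)
    return mat_mul(M, Mt) == mat_mul(Mt, M)

def is_self_adjoint(M):
    return M == transpose(M)

a, b = 3, 2
form_c = [[a, -b], [b, a]]        # the form in 9.27(c), with b > 0
not_normal = [[3, -2], [2, 5]]    # the form in 9.29 with d != a

# With b < 0, passing to the basis e1, -e2 conjugates by P = diag(1, -1),
# which is its own inverse.
P = [[1, 0], [0, -1]]
negative_b = [[3, 2], [-2, 3]]    # a = 3, b = -2
flipped = mat_mul(P, mat_mul(negative_b, P))

print(is_normal(form_c), is_self_adjoint(form_c), is_normal(not_normal), flipped)
```

The last matrix printed has the form required by (c) with lower-left entry $2 > 0$, matching the step in the proof that (b) implies (c).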


The next result tells us that a normal operator restricted to an invariant 
subspace is normal. This will allow us to use induction on dim V when we 
prove our description of normal operators (9.34). 


9.30 Normal operators and invariant subspaces 


Suppose $V$ is an inner product space, $T \in \mathcal{L}(V)$ is normal, and $U$ is a
subspace of $V$ that is invariant under $T$. Then

(a) $U^\perp$ is invariant under $T$;

(b) $U$ is invariant under $T^*$;

(c) $(T|_U)^* = (T^*)|_U$;

(d) $T|_U \in \mathcal{L}(U)$ and $T|_{U^\perp} \in \mathcal{L}(U^\perp)$ are normal operators.

Proof First we will prove (a). Let $e_1, \dots, e_m$ be an orthonormal basis
of $U$. Extend to an orthonormal basis $e_1, \dots, e_m, f_1, \dots, f_n$ of $V$ (this is
possible by 6.35). Because $U$ is invariant under $T$, each $Te_j$ is a linear
combination of $e_1, \dots, e_m$. Thus the matrix of $T$ with respect to the basis
$e_1, \dots, e_m, f_1, \dots, f_n$ is of the form

\[
\mathcal{M}(T) =
\begin{array}{c|cc}
 & e_1 \cdots e_m & f_1 \cdots f_n \\ \hline
\begin{matrix} e_1 \\ \vdots \\ e_m \end{matrix} & A & B \\
\begin{matrix} f_1 \\ \vdots \\ f_n \end{matrix} & 0 & C
\end{array}
\]

here $A$ denotes an $m$-by-$m$ matrix, $0$ denotes the $n$-by-$m$ matrix of all 0's, $B$
denotes an $m$-by-$n$ matrix, $C$ denotes an $n$-by-$n$ matrix, and for convenience
the basis has been listed along the top and left sides of the matrix.

For each $j \in \{1, \dots, m\}$, $\|Te_j\|^2$ equals the sum of the squares of the
absolute values of the entries in the $j$th column of $A$ (see 6.25). Hence

9.31 \[ \sum_{j=1}^{m} \|Te_j\|^2 = \text{the sum of the squares of the absolute values of the entries of } A. \]

For each $j \in \{1, \dots, m\}$, $\|T^*e_j\|^2$ equals the sum of the squares of the
absolute values of the entries in the $j$th rows of $A$ and $B$. Hence

9.32 \[ \sum_{j=1}^{m} \|T^*e_j\|^2 = \text{the sum of the squares of the absolute values of the entries of } A \text{ and } B. \]

Because $T$ is normal, $\|Te_j\| = \|T^*e_j\|$ for each $j$ (see 7.20); thus

\[ \sum_{j=1}^{m} \|Te_j\|^2 = \sum_{j=1}^{m} \|T^*e_j\|^2. \]

This equation, along with 9.31 and 9.32, implies that the sum of the squares
of the absolute values of the entries of $B$ equals 0. In other words, $B$ is the
matrix of all 0's. Thus

9.33 \[
\mathcal{M}(T) =
\begin{array}{c|cc}
 & e_1 \cdots e_m & f_1 \cdots f_n \\ \hline
\begin{matrix} e_1 \\ \vdots \\ e_m \end{matrix} & A & 0 \\
\begin{matrix} f_1 \\ \vdots \\ f_n \end{matrix} & 0 & C
\end{array}
\]

This representation shows that $Tf_k$ is in the span of $f_1, \dots, f_n$ for each $k$.
Because $f_1, \dots, f_n$ is a basis of $U^\perp$, this implies that $Tv \in U^\perp$ whenever
$v \in U^\perp$. In other words, $U^\perp$ is invariant under $T$, completing the proof of (a).

To prove (b), note that $\mathcal{M}(T^*)$, which is the conjugate transpose of $\mathcal{M}(T)$,
has a block of 0's in the lower left corner (because $\mathcal{M}(T)$, as given above, has
a block of 0's in the upper right corner). In other words, each $T^*e_j$ can be
written as a linear combination of $e_1, \dots, e_m$. Thus $U$ is invariant under $T^*$,
completing the proof of (b).

To prove (c), let $S = T|_U \in \mathcal{L}(U)$. Fix $v \in U$. Then

\[ \langle Su, v \rangle = \langle Tu, v \rangle = \langle u, T^*v \rangle \]

for all $u \in U$. Because $T^*v \in U$ [by (b)], the equation above shows that
$S^*v = T^*v$. In other words, $(T|_U)^* = (T^*)|_U$, completing the proof of (c).

To prove (d), note that $T$ commutes with $T^*$ (because $T$ is normal) and
that $(T|_U)^* = (T^*)|_U$ [by (c)]. Thus $T|_U$ commutes with its adjoint and
hence is normal. Interchanging the roles of $U$ and $U^\perp$, which is justified by
(a), shows that $T|_{U^\perp}$ is also normal, completing the proof of (d). ∎

Our next result shows that normal operators on real inner product spaces
come close to having diagonal matrices. Specifically, we get block diagonal
matrices, with each block having size at most 2-by-2.

[Note that if an operator $T$ has a block diagonal matrix with respect
to some basis, then the entry in each 1-by-1 block on the diagonal
of this matrix is an eigenvalue of $T$.]


We cannot expect to do better than the next result, because on a real inner 
product space there exist normal operators that do not have a diagonal matrix 
with respect to any basis. For example, the operator $T \in \mathcal{L}(\mathbf{R}^2)$ defined by
$T(x, y) = (-y, x)$ is normal (as you should verify) but has no eigenvalues;
thus this particular $T$ does not have even an upper-triangular matrix with
respect to any basis of $\mathbf{R}^2$.

9.34 Characterization of normal operators when $\mathbf{F} = \mathbf{R}$


Suppose $V$ is a real inner product space and $T \in \mathcal{L}(V)$. Then the following
are equivalent:

(a) $T$ is normal.

(b) There is an orthonormal basis of $V$ with respect to which $T$ has a
block diagonal matrix such that each block is a 1-by-1 matrix or a
2-by-2 matrix of the form

\[ \begin{pmatrix} a & -b \\ b & a \end{pmatrix}, \]

with $b > 0$.


Proof First suppose (b) holds. With respect to the basis given by (b), the
matrix of $T$ commutes with the matrix of $T^*$ (which is the transpose of the
matrix of $T$), as you should verify (use Exercise 9 in Section 8.B for the
product of two block diagonal matrices). Thus $T$ commutes with $T^*$, which
means that $T$ is normal, completing the proof that (b) implies (a).

Now suppose (a) holds, so $T$ is normal. We will prove that (b) holds
by induction on $\dim V$. To get started, note that our desired result holds if
$\dim V = 1$ (trivially) or if $\dim V = 2$ [if $T$ is self-adjoint, use the Real
Spectral Theorem (7.29); if $T$ is not self-adjoint, use 9.27].

Now assume that $\dim V > 2$ and that the desired result holds on vector
spaces of smaller dimension. Let $U$ be a subspace of $V$ of dimension 1 that
is invariant under $T$ if such a subspace exists (in other words, if $T$ has an
eigenvector, let $U$ be the span of this eigenvector). If no such subspace exists,
let $U$ be a subspace of $V$ of dimension 2 that is invariant under $T$ (an invariant
subspace of dimension 1 or 2 always exists by 9.8).

If $\dim U = 1$, choose a vector in $U$ with norm 1; this vector will be
an orthonormal basis of $U$, and of course the matrix of $T|_U \in \mathcal{L}(U)$ is a
1-by-1 matrix. If $\dim U = 2$, then $T|_U \in \mathcal{L}(U)$ is normal (by 9.30) but not
self-adjoint (otherwise $T|_U$, and hence $T$, would have an eigenvector by 7.27).
Thus we can choose an orthonormal basis of $U$ with respect to which the
matrix of $T|_U \in \mathcal{L}(U)$ has the required form (see 9.27).

Now $U^\perp$ is invariant under $T$ and $T|_{U^\perp}$ is a normal operator on $U^\perp$
(by 9.30). Thus by our induction hypothesis, there is an orthonormal basis
of $U^\perp$ with respect to which the matrix of $T|_{U^\perp}$ has the desired form. Adjoining
this basis to the basis of $U$ gives an orthonormal basis of $V$ with respect
to which the matrix of $T$ has the desired form. Thus (b) holds. ∎

Isometries on Real Inner Product Spaces


As we will see, the next example is a key building block for isometries on real
inner product spaces. Also, note that the next example shows that an isometry
on $\mathbf{R}^2$ may have no eigenvalues.


9.35 Example Let $\theta \in \mathbf{R}$. Then the operator on $\mathbf{R}^2$ of counterclockwise
rotation (centered at the origin) by an angle of $\theta$ is an isometry, as is geometrically
obvious. The matrix of this operator with respect to the standard basis is

\[ \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}. \]

If $\theta$ is not an integer multiple of $\pi$, then no nonzero vector of $\mathbf{R}^2$ gets mapped
to a scalar multiple of itself, and hence the operator has no eigenvalues.
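
A quick numerical illustration of Example 9.35 follows. It is a sketch only: the angle and test vector are arbitrary choices of ours, and the eigenvalue check relies on a standard fact not developed in this section, namely that a real 2-by-2 matrix with diagonal sum $s$ and determinant $p$ has a real eigenvalue exactly when $s^2 - 4p \geq 0$.

```python
import math

def rotation(theta):
    """Matrix of counterclockwise rotation by theta in the standard basis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def apply(M, v):
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

def norm(v):
    return math.hypot(v[0], v[1])

theta = 0.7            # arbitrary angle, not an integer multiple of pi
v = [3.0, -4.0]        # arbitrary test vector with norm 5
w = apply(rotation(theta), v)

# Isometry: the rotated vector has the same norm (up to rounding).
print(norm(v), norm(w))

# Here s = 2 cos(theta) and p = cos^2 + sin^2 = 1, so the discriminant is
# 4 cos^2(theta) - 4, which is negative unless theta is a multiple of pi.
discriminant = 4 * math.cos(theta) ** 2 - 4
print(discriminant)
```

Floating-point arithmetic is used here, so the two norms should be compared with a tolerance rather than for exact equality.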


The next result shows that every isometry on a real inner product space is 
composed of pieces that are rotations on 2-dimensional subspaces, pieces that 
equal the identity operator, and pieces that equal multiplication by —1. 


9.36 Description of isometries when F = R 


Suppose $V$ is a real inner product space and $S \in \mathcal{L}(V)$. Then the following
are equivalent:

(a) $S$ is an isometry.

(b) There is an orthonormal basis of $V$ with respect to which $S$ has
a block diagonal matrix such that each block on the diagonal is a
1-by-1 matrix containing $1$ or $-1$ or is a 2-by-2 matrix of the form

\[ \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}, \]

with $\theta \in (0, \pi)$.


Proof First suppose (a) holds, so $S$ is an isometry. Because $S$ is normal, there
is an orthonormal basis of $V$ with respect to which $S$ has a block diagonal
matrix such that each block is a 1-by-1 matrix or a 2-by-2 matrix of the form

9.37 \[ \begin{pmatrix} a & -b \\ b & a \end{pmatrix}, \]

with $b > 0$ (by 9.34).

If $\lambda$ is an entry in a 1-by-1 matrix along the diagonal of the matrix of $S$
(with respect to the basis mentioned above), then there is a basis vector $e_j$
such that $Se_j = \lambda e_j$. Because $S$ is an isometry, this implies that $|\lambda| = 1$.
Thus $\lambda = 1$ or $\lambda = -1$, because these are the only real numbers with absolute
value 1.

Now consider a 2-by-2 matrix of the form 9.37 along the diagonal of the
matrix of $S$. There are basis vectors $e_j, e_{j+1}$ such that

\[ Se_j = ae_j + be_{j+1}. \]

Thus

\[ 1 = \|e_j\|^2 = \|Se_j\|^2 = a^2 + b^2. \]

The equation above, along with the condition $b > 0$, implies that there exists
a number $\theta \in (0, \pi)$ such that $a = \cos\theta$ and $b = \sin\theta$. Thus the matrix 9.37
has the required form, completing the proof in this direction.

Conversely, now suppose (b) holds, so there is an orthonormal basis of $V$
with respect to which the matrix of $S$ has the form required by the theorem.
Thus there is a direct sum decomposition

\[ V = U_1 \oplus \cdots \oplus U_m, \]

where each $U_j$ is a subspace of $V$ of dimension 1 or 2. Furthermore, any two
vectors belonging to distinct $U$'s are orthogonal, and each $S|_{U_j}$ is an isometry
mapping $U_j$ into $U_j$. If $v \in V$, we can write

\[ v = u_1 + \cdots + u_m, \]

where each $u_j$ is in $U_j$. Applying $S$ to the equation above and then taking
norms gives

\begin{align*}
\|Sv\|^2 &= \|Su_1 + \cdots + Su_m\|^2 \\
&= \|Su_1\|^2 + \cdots + \|Su_m\|^2 \\
&= \|u_1\|^2 + \cdots + \|u_m\|^2 \\
&= \|v\|^2.
\end{align*}

Thus $S$ is an isometry, and hence (a) holds. ∎

EXERCISES 9.B


1 Suppose $S \in \mathcal{L}(\mathbf{R}^3)$ is an isometry. Prove that there exists a nonzero
vector $x \in \mathbf{R}^3$ such that $S^2 x = x$.

2 Prove that every isometry on an odd-dimensional real inner product space
has $1$ or $-1$ as an eigenvalue.

3 Suppose $V$ is a real inner product space. Show that

\[ \langle u + iv, x + iy \rangle = \langle u, x \rangle + \langle v, y \rangle + \bigl(\langle v, x \rangle - \langle u, y \rangle\bigr) i \]

for $u, v, x, y \in V$ defines a complex inner product on $V_{\mathbf{C}}$.

4 Suppose $V$ is a real inner product space and $T \in \mathcal{L}(V)$ is self-adjoint.
Show that $T_{\mathbf{C}}$ is a self-adjoint operator on the inner product space $V_{\mathbf{C}}$
defined by the previous exercise.

5 Use the previous exercise to give a proof of the Real Spectral Theorem
(7.29) via complexification and the Complex Spectral Theorem (7.24).

6 Give an example of an operator $T$ on an inner product space such that $T$
has an invariant subspace whose orthogonal complement is not invariant
under $T$.

[The exercise above shows that 9.30 can fail without the hypothesis that
$T$ is normal.]

7 Suppose $T \in \mathcal{L}(V)$ and $T$ has a block diagonal matrix

\[ \begin{pmatrix} A_1 & & 0 \\ & \ddots & \\ 0 & & A_m \end{pmatrix} \]

with respect to some basis of $V$. For $j = 1, \dots, m$, let $T_j$ be the operator
on $V$ whose matrix with respect to the same basis is a block diagonal
matrix with blocks the same size as in the matrix above, with $A_j$ in the
$j$th block, and with all the other blocks on the diagonal equal to identity
matrices (of the appropriate size). Prove that $T = T_1 \cdots T_m$.

8 Suppose $D$ is the differentiation operator on the vector space $V$ in
Exercise 21 in Section 7.A. Find an orthonormal basis of $V$ such that
the matrix of the normal operator $D$ has the form promised by 9.34.


CHAPTER 10


Trace and Determinant


[Chapter opening image: British mathematician and pioneer
computer scientist Ada Lovelace
(1815-1852), as painted by Alfred
Chalon in this 1840 portrait.]


Throughout this book our emphasis has been on linear maps and operators 
rather than on matrices. In this chapter we pay more attention to matrices as 
we define the trace and determinant of an operator and then connect these 
notions to the corresponding notions for matrices. The book concludes with 
an explanation of the important role played by determinants in the theory of 
volume and integration. 

Our assumptions for this chapter are as follows: 


10.1 Notation $\mathbf{F}$, $V$


• $\mathbf{F}$ denotes $\mathbf{R}$ or $\mathbf{C}$.

• $V$ denotes a finite-dimensional nonzero vector space over $\mathbf{F}$.


LEARNING OBJECTIVES FOR THIS CHAPTER

• change of basis and its effect upon the matrix of an operator
• trace of an operator and of a matrix
• determinant of an operator and of a matrix
• determinants and volume

10.A Trace


For our study of the trace and determinant, we will need to know how the 
matrix of an operator changes with a change of basis. Thus we begin this 
chapter by developing the necessary material about change of basis. 


Change of Basis 


With respect to every basis of $V$, the matrix of the identity operator $I \in \mathcal{L}(V)$
is the diagonal matrix with 1's on the diagonal and 0's elsewhere. We also use
the symbol $I$ for the name of this matrix, as shown in the next definition.


10.2 Definition identity matrix, $I$

Suppose $n$ is a positive integer. The $n$-by-$n$ diagonal matrix

\[ \begin{pmatrix} 1 & & 0 \\ & \ddots & \\ 0 & & 1 \end{pmatrix} \]

is called the identity matrix and is denoted $I$.


Note that we use the symbol $I$ to denote the identity operator (on all vector
spaces) and the identity matrix (of all possible sizes). You should always be
able to tell from the context which particular meaning of $I$ is intended. For
example, consider the equation $\mathcal{M}(I) = I$; on the left side $I$ denotes the
identity operator, and on the right side $I$ denotes the identity matrix.

If $A$ is a square matrix (with entries in $\mathbf{F}$, as usual) with the same size as $I$,
then $AI = IA = A$, as you should verify.


10.3 Definition invertible, inverse, $A^{-1}$


A square matrix $A$ is called invertible if there is a square matrix $B$ of
the same size such that $AB = BA = I$; we call $B$ the inverse of $A$ and
denote it by $A^{-1}$.


The same proof as used in 3.54 shows that if $A$ is an invertible square
matrix, then there is a unique matrix $B$ such that $AB = BA = I$ (and thus the
notation $B = A^{-1}$ is justified).

[Some mathematicians use the terms nonsingular, which means
the same as invertible, and singular, which means the same
as noninvertible.]

In Section 3.C we defined the matrix of a linear map from one vector space
to another with respect to two bases—one basis of the first vector space and 
another basis of the second vector space. When we study operators, which are 
linear maps from a vector space to itself, we almost always use the same basis 
for both vector spaces (after all, the two vector spaces in question are equal). 
Thus we usually refer to the matrix of an operator with respect to a basis and 
display at most one basis because we are using one basis in two capacities. 

The next result is one of the unusual cases in which we use two different 
bases even though we have operators from a vector space to itself. It is just a 
convenient restatement of 3.43 (with U and W both equal to V), but now we 
are being more careful to include the various bases explicitly in the notation. 
The result below holds because we defined matrix multiplication to make it 
true—see 3.43 and the material preceding it. 


10.4 The matrix of the product of linear maps 


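Suppose $u_1, \dots, u_n$ and $v_1, \dots, v_n$ and $w_1, \dots, w_n$ are all bases of $V$.
Suppose $S, T \in \mathcal{L}(V)$. Then

\[
\mathcal{M}\bigl(ST, (u_1, \dots, u_n), (w_1, \dots, w_n)\bigr) =
\mathcal{M}\bigl(S, (v_1, \dots, v_n), (w_1, \dots, w_n)\bigr)\,
\mathcal{M}\bigl(T, (u_1, \dots, u_n), (v_1, \dots, v_n)\bigr).
\]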


The next result deals with the matrix of the identity operator $I$ with
respect to two different bases. Note that the $k$th column of the matrix
$\mathcal{M}\bigl(I, (u_1, \dots, u_n), (v_1, \dots, v_n)\bigr)$ consists of the scalars needed to write $u_k$
as a linear combination of $v_1, \dots, v_n$.


10.5 Matrix of the identity with respect to two bases 


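Suppose $u_1, \dots, u_n$ and $v_1, \dots, v_n$ are bases of $V$. Then the matrices
$\mathcal{M}\bigl(I, (u_1, \dots, u_n), (v_1, \dots, v_n)\bigr)$ and $\mathcal{M}\bigl(I, (v_1, \dots, v_n), (u_1, \dots, u_n)\bigr)$
are invertible, and each is the inverse of the other.


Proof In 10.4, replace $w_j$ with $u_j$, and replace $S$ and $T$ with $I$, getting

\[ I = \mathcal{M}\bigl(I, (v_1, \dots, v_n), (u_1, \dots, u_n)\bigr)\, \mathcal{M}\bigl(I, (u_1, \dots, u_n), (v_1, \dots, v_n)\bigr). \]

Now interchange the roles of the $u$'s and $v$'s, getting

\[ I = \mathcal{M}\bigl(I, (u_1, \dots, u_n), (v_1, \dots, v_n)\bigr)\, \mathcal{M}\bigl(I, (v_1, \dots, v_n), (u_1, \dots, u_n)\bigr). \]

These two equations give the desired result. ∎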
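
10.6 Example Consider the bases $(4, 2), (5, 3)$ and $(1, 0), (0, 1)$ of $\mathbf{F}^2$.
Obviously

\[ \mathcal{M}\Bigl(I, \bigl((4, 2), (5, 3)\bigr), \bigl((1, 0), (0, 1)\bigr)\Bigr) = \begin{pmatrix} 4 & 5 \\ 2 & 3 \end{pmatrix}, \]

because $I(4, 2) = 4(1, 0) + 2(0, 1)$ and $I(5, 3) = 5(1, 0) + 3(0, 1)$.
The inverse of the matrix above is

\[ \begin{pmatrix} \frac{3}{2} & -\frac{5}{2} \\ -1 & 2 \end{pmatrix}, \]

as you should verify. Thus 10.5 implies that

\[ \mathcal{M}\Bigl(I, \bigl((1, 0), (0, 1)\bigr), \bigl((4, 2), (5, 3)\bigr)\Bigr) = \begin{pmatrix} \frac{3}{2} & -\frac{5}{2} \\ -1 & 2 \end{pmatrix}. \]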
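
The verification requested in Example 10.6 can be done exactly with rational arithmetic. The sketch below uses Python's standard `fractions` module; `mat_mul` is our own helper name.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Matrix of the identity from the basis (4,2), (5,3) to the standard basis.
A = [[Fraction(4), Fraction(5)],
     [Fraction(2), Fraction(3)]]

# Claimed inverse from Example 10.6.
A_inv = [[Fraction(3, 2), Fraction(-5, 2)],
         [Fraction(-1), Fraction(2)]]

identity = [[1, 0], [0, 1]]
print(mat_mul(A, A_inv) == identity, mat_mul(A_inv, A) == identity)

# The first column of A_inv writes (1, 0) in terms of (4, 2) and (5, 3).
c1, c2 = A_inv[0][0], A_inv[1][0]
print((c1 * 4 + c2 * 5, c1 * 2 + c2 * 3))
```

Both products must equal the identity matrix, matching Definition 10.3, which requires $AB = BA = I$.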


Now we can see how the matrix of $T$ changes when we change bases. In
the result below, we have two different bases of $V$. Recall that the notation
$\mathcal{M}\bigl(T, (u_1, \dots, u_n)\bigr)$ is shorthand for $\mathcal{M}\bigl(T, (u_1, \dots, u_n), (u_1, \dots, u_n)\bigr)$.


10.7 Change of basis formula 


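Suppose $T \in \mathcal{L}(V)$. Let $u_1, \dots, u_n$ and $v_1, \dots, v_n$ be bases of $V$. Let
$A = \mathcal{M}\bigl(I, (u_1, \dots, u_n), (v_1, \dots, v_n)\bigr)$. Then

\[ \mathcal{M}\bigl(T, (u_1, \dots, u_n)\bigr) = A^{-1}\, \mathcal{M}\bigl(T, (v_1, \dots, v_n)\bigr)\, A. \]


Proof In 10.4, replace $w_j$ with $u_j$ and replace $S$ with $I$, getting

10.8 \[ \mathcal{M}\bigl(T, (u_1, \dots, u_n)\bigr) = A^{-1}\, \mathcal{M}\bigl(T, (u_1, \dots, u_n), (v_1, \dots, v_n)\bigr), \]

where we have used 10.5.

Again use 10.4, this time replacing $w_j$ with $v_j$. Also replace $T$ with $I$ and
replace $S$ with $T$, getting

\[ \mathcal{M}\bigl(T, (u_1, \dots, u_n), (v_1, \dots, v_n)\bigr) = \mathcal{M}\bigl(T, (v_1, \dots, v_n)\bigr)\, A. \]

Substituting the equation above into 10.8 gives the desired result. ∎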
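
To see the change of basis formula 10.7 at work, here is a small sketch that reuses the bases of Example 10.6. The operator is a hypothetical one chosen by us for illustration: its matrix with respect to the standard basis is taken to have rows $(1, 2)$ and $(3, 4)$.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# A = M(I, (u1, u2), (v1, v2)) with u = (4,2), (5,3) and v the standard basis.
A = [[Fraction(4), Fraction(5)], [Fraction(2), Fraction(3)]]
A_inv = [[Fraction(3, 2), Fraction(-5, 2)], [Fraction(-1), Fraction(2)]]

# Hypothetical operator T: its matrix with respect to the standard basis v.
M_v = [[Fraction(1), Fraction(2)], [Fraction(3), Fraction(4)]]

# Formula 10.7: M(T, u) = A^{-1} M(T, v) A.
M_u = mat_mul(A_inv, mat_mul(M_v, A))
print(M_u)

# Independent check of the first column: T(4, 2) = (8, 20) in standard
# coordinates, and it should equal M_u[0][0]*(4, 2) + M_u[1][0]*(5, 3).
T_u1 = (M_v[0][0] * 4 + M_v[0][1] * 2, M_v[1][0] * 4 + M_v[1][1] * 2)
rebuilt = (M_u[0][0] * 4 + M_u[1][0] * 5, M_u[0][0] * 2 + M_u[1][0] * 3)
print(T_u1, rebuilt)
```

The two matrices `M_v` and `M_u` look quite different, yet they describe the same operator; only the basis has changed.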

Trace: A Connection Between Operators and Matrices


Suppose $T \in \mathcal{L}(V)$ and $\lambda$ is an eigenvalue of $T$. Let $n = \dim V$. Recall
that we defined the multiplicity of $\lambda$ to be the dimension of the generalized
eigenspace $G(\lambda, T)$ (see 8.24) and that this multiplicity equals
$\dim \operatorname{null}\,(T - \lambda I)^n$ (see 8.11). Recall also that if $V$ is a complex vector
space, then the sum of the multiplicities of all the eigenvalues of $T$ equals $n$
(see 8.26).

In the definition below, the sum of the eigenvalues "with each eigenvalue
repeated according to its multiplicity" means that if $\lambda_1, \dots, \lambda_m$ are the distinct
eigenvalues of $T$ (or of $T_{\mathbf{C}}$ if $V$ is a real vector space) with multiplicities
$d_1, \dots, d_m$, then the sum is

\[ d_1 \lambda_1 + \cdots + d_m \lambda_m. \]

Or if you prefer to list the eigenvalues with each repeated according to its
multiplicity, then the eigenvalues could be denoted $\lambda_1, \dots, \lambda_n$ (where the
index $n$ equals $\dim V$) and the sum is

\[ \lambda_1 + \cdots + \lambda_n. \]


10.9 Definition trace of an operator 
Suppose $T \in \mathcal{L}(V)$.

• If $\mathbf{F} = \mathbf{C}$, then the trace of $T$ is the sum of the eigenvalues of $T$,
with each eigenvalue repeated according to its multiplicity.

• If $\mathbf{F} = \mathbf{R}$, then the trace of $T$ is the sum of the eigenvalues of $T_{\mathbf{C}}$,
with each eigenvalue repeated according to its multiplicity.

The trace of $T$ is denoted by $\operatorname{trace} T$.


10.10 Example Suppose $T \in \mathcal{L}(\mathbf{C}^3)$ is the operator whose matrix is

\[ \begin{pmatrix} 3 & -1 & -2 \\ 3 & 2 & -3 \\ 1 & 2 & 0 \end{pmatrix}. \]

Then the eigenvalues of $T$ are $1$, $2 + 3i$, and $2 - 3i$, each with multiplicity 1,
as you can verify. Computing the sum of the eigenvalues, we find that
$\operatorname{trace} T = 1 + (2 + 3i) + (2 - 3i)$; in other words, $\operatorname{trace} T = 5$.
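
One way to carry out the verification suggested in Example 10.10 is to check that the matrix minus $\lambda I$ fails to be invertible for each claimed eigenvalue $\lambda$. The sketch below does this with a 3-by-3 cofactor expansion. Determinants are not developed until later in this chapter, so treat this as a computational shortcut rather than the method of the text; the helper names are ours.

```python
def det3(M):
    """Determinant of a 3-by-3 matrix by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

M = [[3, -1, -2],
     [3, 2, -3],
     [1, 2, 0]]

def shifted_det(lam):
    """Determinant of M - lam*I; it is zero exactly when lam is an eigenvalue."""
    shifted = [[M[i][j] - (lam if i == j else 0) for j in range(3)]
               for i in range(3)]
    return det3(shifted)

eigenvalues = [1, 2 + 3j, 2 - 3j]
print([shifted_det(lam) for lam in eigenvalues])

# Sum of the eigenvalues versus the sum of the diagonal entries of M.
print(sum(eigenvalues), M[0][0] + M[1][1] + M[2][2])
```

Because all the numbers involved have small integer real and imaginary parts, the complex arithmetic here is exact and the determinants come out as exact zeros.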

The trace has a close connection with the characteristic polynomial. Suppose
$\lambda_1, \dots, \lambda_n$ are the eigenvalues of $T$ (or of $T_{\mathbf{C}}$ if $V$ is a real vector space)
with each eigenvalue repeated according to its multiplicity. Then by definition
(see 8.34 and 9.21), the characteristic polynomial of $T$ equals

\[ (z - \lambda_1) \cdots (z - \lambda_n). \]

Expanding the polynomial above, we can write the characteristic polynomial
of $T$ in the form

10.11 \[ z^n - (\lambda_1 + \cdots + \lambda_n) z^{n-1} + \cdots + (-1)^n (\lambda_1 \cdots \lambda_n). \]

The expression above immediately leads to the following result.


10.12 Trace and characteristic polynomial


Suppose $T \in \mathcal{L}(V)$. Let $n = \dim V$. Then $\operatorname{trace} T$ equals the negative of
the coefficient of $z^{n-1}$ in the characteristic polynomial of $T$.
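
Result 10.12 can be illustrated with the eigenvalues from Example 10.10. The following sketch expands the product of linear factors and reads off the coefficient of $z^{n-1}$; the function name `poly_from_roots` is our own.

```python
def poly_from_roots(roots):
    """Coefficients (highest degree first) of the product of (z - r) over roots."""
    coeffs = [1]
    for r in roots:
        shifted = coeffs + [0]                  # current polynomial times z
        scaled = [0] + [r * c for c in coeffs]  # current polynomial times r
        coeffs = [s - t for s, t in zip(shifted, scaled)]
    return coeffs

# Eigenvalues from Example 10.10, each with multiplicity 1, so n = 3.
eigenvalues = [1, 2 + 3j, 2 - 3j]
char_poly = poly_from_roots(eigenvalues)

# The negative of the z^(n-1) coefficient should equal the trace,
# and (-1)^n times the constant term should equal the product of the eigenvalues.
trace_from_poly = -char_poly[1]
product_from_poly = (-1) ** 3 * char_poly[3]
print(char_poly, trace_from_poly, product_from_poly)
```

The expansion gives $z^3 - 5z^2 + 17z - 13$, so the trace is 5, in agreement with Example 10.10.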


Most of the rest of this section is devoted to discovering how to compute 
trace T from the matrix of T (with respect to an arbitrary basis). 

Let's start with the easiest situation. Suppose $V$ is a complex vector space,
$T \in \mathcal{L}(V)$, and we choose a basis of $V$ as in 8.29. With respect to that basis,
$T$ has an upper-triangular matrix with the diagonal of the matrix containing
precisely the eigenvalues of $T$, each repeated according to its multiplicity.
Thus $\operatorname{trace} T$ equals the sum of the diagonal entries of $\mathcal{M}(T)$ with respect to
that basis.

The same formula works for the operator $T \in \mathcal{L}(\mathbf{C}^3)$ in Example 10.10
whose trace equals 5. In that example, the matrix is not in upper-triangular
form. However, the sum of the diagonal entries of the matrix in that example
equals 5, which is the trace of the operator $T$.

At this point you should suspect that trace T equals the sum of the diagonal 
entries of the matrix of T with respect to an arbitrary basis. Remarkably, this 
suspicion turns out to be true. To prove it, we start by making the following 
definition. 


10.13 Definition trace of a matrix 


The trace of a square matrix A, denoted trace A, is defined to be the sum 
of the diagonal entries of A. 

Now we have defined the trace of an operator and the trace of a square
matrix, using the same word "trace" in two different contexts. This would be
bad terminology unless the two concepts turn out to be essentially the same.

As we will see, it is indeed true that $\operatorname{trace} T = \operatorname{trace} \mathcal{M}\bigl(T, (v_1, \dots, v_n)\bigr)$,
where $v_1, \dots, v_n$ is an arbitrary basis of $V$. We will need the following result
for the proof.


10.14 Trace of AB equals trace of BA 


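If $A$ and $B$ are square matrices of the same size, then

\[ \operatorname{trace}(AB) = \operatorname{trace}(BA). \]


Proof Suppose

\[
A = \begin{pmatrix} A_{1,1} & \cdots & A_{1,n} \\ \vdots & & \vdots \\ A_{n,1} & \cdots & A_{n,n} \end{pmatrix},
\qquad
B = \begin{pmatrix} B_{1,1} & \cdots & B_{1,n} \\ \vdots & & \vdots \\ B_{n,1} & \cdots & B_{n,n} \end{pmatrix}.
\]

The $j$th term on the diagonal of $AB$ equals

\[ \sum_{k=1}^{n} A_{j,k} B_{k,j}. \]

Thus

\begin{align*}
\operatorname{trace}(AB) &= \sum_{j=1}^{n} \sum_{k=1}^{n} A_{j,k} B_{k,j} \\
&= \sum_{k=1}^{n} \sum_{j=1}^{n} B_{k,j} A_{j,k} \\
&= \sum_{k=1}^{n} \bigl(k\text{th term on the diagonal of } BA\bigr) \\
&= \operatorname{trace}(BA),
\end{align*}

as desired. ∎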
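
A small numerical illustration of 10.14 follows. The two matrices are arbitrary choices of ours, picked so that $AB \neq BA$ even though the traces agree; the helper names are hypothetical.

```python
def mat_mul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(M):
    """Sum of the diagonal entries of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))

A = [[1, 2],
     [3, 4]]
B = [[0, 5],
     [6, 7]]

AB = mat_mul(A, B)
BA = mat_mul(B, A)

# The products differ, but their traces coincide.
print(AB, BA, trace(AB), trace(BA))
```

The double sum in the proof is exactly what `trace(mat_mul(A, B))` computes; swapping the order of summation turns it into `trace(mat_mul(B, A))`.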


Now we can prove that the sum of the diagonal entries of the matrix of 
an operator is independent of the basis with respect to which the matrix is 
computed. 
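
10.15 Trace of matrix of operator does not depend on basis


Let $T \in \mathcal{L}(V)$. Suppose $u_1, \dots, u_n$ and $v_1, \dots, v_n$ are bases of $V$. Then

\[ \operatorname{trace} \mathcal{M}\bigl(T, (u_1, \dots, u_n)\bigr) = \operatorname{trace} \mathcal{M}\bigl(T, (v_1, \dots, v_n)\bigr). \]


Proof Let $A = \mathcal{M}\bigl(I, (u_1, \dots, u_n), (v_1, \dots, v_n)\bigr)$. Then

\begin{align*}
\operatorname{trace} \mathcal{M}\bigl(T, (u_1, \dots, u_n)\bigr)
&= \operatorname{trace}\Bigl(A^{-1} \bigl(\mathcal{M}(T, (v_1, \dots, v_n))\, A\bigr)\Bigr) \\
&= \operatorname{trace}\Bigl(\bigl(\mathcal{M}(T, (v_1, \dots, v_n))\, A\bigr) A^{-1}\Bigr) \\
&= \operatorname{trace} \mathcal{M}\bigl(T, (v_1, \dots, v_n)\bigr),
\end{align*}

where the first equality comes from 10.7 and the second equality follows
from 10.14. The third equality completes the proof. ∎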


The result below, which is the most important result in this section, states 
that the trace of an operator equals the sum of the diagonal entries of the 
matrix of the operator. This theorem does not specify a basis because, by the 
result above, the sum of the diagonal entries of the matrix of an operator is 
the same for every choice of basis. 


10.16 Trace of an operator equals trace of its matrix 
Suppose $T \in \mathcal{L}(V)$. Then $\operatorname{trace} T = \operatorname{trace} \mathcal{M}(T)$.


Proof As noted above, $\operatorname{trace} \mathcal{M}(T)$ is independent of which basis of $V$ we
choose (by 10.15). Thus to show that

\[ \operatorname{trace} T = \operatorname{trace} \mathcal{M}(T) \]

for every basis of $V$, we need only show that the equation above holds for
some basis of $V$.

As we have already discussed, if $V$ is a complex vector space, then choosing
the basis as in 8.29 gives the desired result. If $V$ is a real vector space,
then applying the complex case to the complexification $T_{\mathbf{C}}$ (which is used to
define $\operatorname{trace} T$) gives the desired result. ∎


If we know the matrix of an operator on a complex vector space, the result 
above allows us to find the sum of all the eigenvalues without finding any of 
the eigenvalues, as shown by the next example. 

10.17 Example Consider the operator on $\mathbf{C}^5$ whose matrix is

\[
\begin{pmatrix}
0 & 0 & 0 & 0 & -3 \\
1 & 0 & 0 & 0 & 6 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0
\end{pmatrix}.
\]

No one can find an exact formula for any of the eigenvalues of this operator.
However, we do know that the sum of the eigenvalues equals 0, because the
sum of the diagonal entries of the matrix above equals 0.


We can use 10.16 to give easy proofs of some useful properties about 
traces of operators by shifting to the language of traces of matrices, where 
certain properties have already been proved or are obvious. The proof of the 
next result is an example of this technique. The eigenvalues of S + T are not, 
in general, formed from adding together eigenvalues of S and eigenvalues of 
T. Thus the next result would be difficult to prove without using 10.16. 


10.18 Trace is additive 
Suppose S, T ∈ L(V). Then trace(S + T) = trace S + trace T.


Proof Choose a basis of V. Then 


trace(S + T) = trace M(S + T) 
= trace(M(S) + M(T)) 
= trace M(S) + trace M(T) 
= trace S + trace T, 


where again the first and last equalities come from 10.16; the second equality
holds because the matrix of a sum of operators is the sum of their matrices;
the third equality is obvious from the definition of the trace of a matrix. ∎


The techniques we have developed have the following curious consequence.
A generalization of this result to infinite-dimensional vector spaces has
important consequences in modern physics, particularly in quantum theory.

[The statement of the next result does not involve traces, although the
short proof uses traces. Whenever something like this happens in
mathematics, we can be sure that a good definition lurks in the background.]


10.19 The identity is not the difference of ST and TS
There do not exist operators S, T ∈ L(V) such that ST − TS = I.


Proof Suppose S, T ∈ L(V). Choose a basis of V. Then


trace(ST − TS) = trace(ST) − trace(TS)
               = trace M(ST) − trace M(TS)
               = trace(M(S)M(T)) − trace(M(T)M(S))
               = 0,


where the first equality comes from 10.18, the second equality comes from
10.16, the third equality comes from 3.43, and the fourth equality comes from
10.14. Clearly the trace of I equals dim V, which is not 0. Because ST − TS
and I have different traces, they cannot be equal. ∎
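
The computation in this proof can be spot-checked on concrete matrices. The following is a minimal sketch in plain Python, assuming operators are represented by square matrices stored as lists of rows; the helper names `matmul`, `trace`, and `commutator_trace` are illustrative and do not come from the text.

```python
import random

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(A):
    """Sum of the diagonal entries of a square matrix."""
    return sum(A[i][i] for i in range(len(A)))

def commutator_trace(S, T):
    """trace(ST - TS), computed entrywise from the two products."""
    ST, TS = matmul(S, T), matmul(T, S)
    n = len(S)
    return trace([[ST[i][j] - TS[i][j] for j in range(n)] for i in range(n)])

random.seed(0)  # fixed seed so the run is reproducible
n = 4
S = [[random.randint(-9, 9) for _ in range(n)] for _ in range(n)]
T = [[random.randint(-9, 9) for _ in range(n)] for _ in range(n)]
identity = [[1 if i == j else 0 for j in range(n)] for i in range(n)]

print(commutator_trace(S, T))  # 0, whatever integer matrices S and T are
print(trace(identity))         # 4, the size of the matrices, which is not 0
```

Integer entries keep the arithmetic exact, so the first printed value is exactly 0 rather than a small rounding error.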


EXERCISES 10.A 


1  Suppose T ∈ L(V) and v₁, …, vₙ is a basis of V. Prove that the matrix
   M(T, (v₁, …, vₙ)) is invertible if and only if T is invertible.


2  Suppose A and B are square matrices of the same size and AB = I.
   Prove that BA = I.


3  Suppose T ∈ L(V) has the same matrix with respect to every basis of V.
   Prove that T is a scalar multiple of the identity operator.


4  Suppose u₁, …, uₙ and v₁, …, vₙ are bases of V. Let T ∈ L(V) be the
   operator such that Tvₖ = uₖ for k = 1, …, n. Prove that


   M(T, (v₁, …, vₙ)) = M(I, (u₁, …, uₙ), (v₁, …, vₙ)).


5  Suppose B is a square matrix with complex entries. Prove that there
   exists an invertible square matrix A with complex entries such that
   A⁻¹BA is an upper-triangular matrix.


6  Give an example of a real vector space V and T ∈ L(V) such that
   trace(T²) < 0.


7  Suppose V is a real vector space, T ∈ L(V), and V has a basis consisting
   of eigenvectors of T. Prove that trace(T²) ≥ 0.


8  Suppose V is an inner product space and v, w ∈ V. Define T ∈ L(V) by
   Tu = ⟨u, v⟩w. Find a formula for trace T.


9  Suppose P ∈ L(V) satisfies P² = P. Prove that


   trace P = dim range P.


10  Suppose V is an inner product space and T ∈ L(V). Prove that

\[ \operatorname{trace} T^* = \overline{\operatorname{trace} T}. \]


11  Suppose V is an inner product space. Suppose T ∈ L(V) is a positive
    operator and trace T = 0. Prove that T = 0.


12  Suppose V is an inner product space and P, Q ∈ L(V) are orthogonal
    projections. Prove that trace(PQ) ≥ 0.


13  Suppose T ∈ L(C³) is the operator whose matrix is

\[
\begin{pmatrix}
51 & -12 & -21 \\
60 & -40 & -28 \\
57 & -68 & 1
\end{pmatrix}.
\]

    Someone tells you (accurately) that −48 and 24 are eigenvalues of T.
    Without using a computer or writing anything down, find the third
    eigenvalue of T.


14  Suppose T ∈ L(V) and c ∈ F. Prove that trace(cT) = c trace T.

15  Suppose S, T ∈ L(V). Prove that trace(ST) = trace(TS).


16  Prove or give a counterexample: if S, T ∈ L(V), then trace(ST) =
    (trace S)(trace T).


17  Suppose T ∈ L(V) is such that trace(ST) = 0 for all S ∈ L(V). Prove
    that T = 0.


18  Suppose V is an inner product space with orthonormal basis e₁, …, eₙ
    and T ∈ L(V). Prove that


    trace(T*T) = ‖Te₁‖² + ··· + ‖Teₙ‖².


    Conclude that the right side of the equation above is independent of
    which orthonormal basis e₁, …, eₙ is chosen for V.


19  Suppose V is an inner product space. Prove that


    ⟨S, T⟩ = trace(ST*)


    defines an inner product on L(V).


20  Suppose V is a complex inner product space and T ∈ L(V). Let
    λ₁, …, λₙ be the eigenvalues of T, repeated according to multiplicity.
    Suppose

\[
\begin{pmatrix}
A_{1,1} & \dots & A_{1,n} \\
\vdots & & \vdots \\
A_{n,1} & \dots & A_{n,n}
\end{pmatrix}
\]

    is the matrix of T with respect to some orthonormal basis of V. Prove
    that

\[ |\lambda_1|^2 + \cdots + |\lambda_n|^2 \le \sum_{k=1}^{n} \sum_{j=1}^{n} |A_{j,k}|^2. \]


21  Suppose V is an inner product space. Suppose T ∈ L(V) and


    ‖T*v‖ ≤ ‖Tv‖


    for every v ∈ V. Prove that T is normal.

    [The exercise above fails on infinite-dimensional inner product spaces,
    leading to what are called hyponormal operators, which have a
    well-developed theory.]


10.B Determinant


Determinant of an Operator 


Now we are ready to define the determinant of an operator. Notice that the 
definition below mimics the approach we took when defining the trace, with 
the product of the eigenvalues replacing the sum of the eigenvalues. 


10.20 Definition determinant of an operator, det T 
Suppose T ∈ L(V).


• If F = C, then the determinant of T is the product of the eigenvalues
of T, with each eigenvalue repeated according to its multiplicity.


• If F = R, then the determinant of T is the product of the eigenvalues
of T_C, with each eigenvalue repeated according to its multiplicity.


The determinant of T is denoted by det T. 


If λ₁, …, λₘ are the distinct eigenvalues of T (or of T_C if V is a real
vector space) with multiplicities d₁, …, dₘ, then the definition above implies

\[ \det T = \lambda_1^{d_1} \cdots \lambda_m^{d_m}. \]

Or if you prefer to list the eigenvalues with each repeated according to its
multiplicity, then the eigenvalues could be denoted λ₁, …, λₙ (where the
index n equals dim V) and the definition above implies

\[ \det T = \lambda_1 \cdots \lambda_n. \]


10.21 Example Suppose T ∈ L(C³) is the operator whose matrix is

\[
\begin{pmatrix}
3 & -1 & -2 \\
3 & 2 & -3 \\
1 & 2 & 0
\end{pmatrix}.
\]

Then the eigenvalues of T are 1, 2 + 3i, and 2 − 3i, each with multiplicity 1,
as you can verify. Computing the product of the eigenvalues, we find that
det T = 1 · (2 + 3i) · (2 − 3i); in other words, det T = 13.


The determinant has a close connection with the characteristic polynomial.
Suppose λ₁, …, λₙ are the eigenvalues of T (or of T_C if V is a real vector
space) with each eigenvalue repeated according to its multiplicity. Then the
expression for the characteristic polynomial of T given by 10.11 gives the
following result.


10.22 Determinant and characteristic polynomial 


Suppose T ∈ L(V). Let n = dim V. Then det T equals (−1)ⁿ times the
constant term of the characteristic polynomial of T.


Combining the result above and 10.12, we have the following result.


10.23 Characteristic polynomial, trace, and determinant


Suppose T ∈ L(V). Then the characteristic polynomial of T can be
written as

\[ z^n - (\operatorname{trace} T)z^{n-1} + \cdots + (-1)^n(\det T). \]
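
As an illustration of 10.22 and 10.23, the data of Example 10.21 can be run through a short computation. The sketch below is plain Python and takes the eigenvalues 1, 2 + 3i, 2 − 3i as given by that example rather than computing them; the helper name `poly_from_roots` is illustrative. It expands (z − λ₁)(z − λ₂)(z − λ₃) and compares the coefficients with the trace and the determinant.

```python
def poly_from_roots(roots):
    """Coefficients, highest degree first, of (z - r1)(z - r2)...(z - rn)."""
    coeffs = [1]
    for r in roots:
        shifted = coeffs + [0]                    # current polynomial times z
        scaled = [0] + [r * c for c in coeffs]    # current polynomial times r
        coeffs = [s - t for s, t in zip(shifted, scaled)]
    return coeffs

# Matrix and eigenvalues as stated in Example 10.21.
A = [[3, -1, -2],
     [3,  2, -3],
     [1,  2,  0]]
eigenvalues = [1, 2 + 3j, 2 - 3j]
n = len(A)

char_poly = poly_from_roots(eigenvalues)   # z^3 - 5z^2 + 17z - 13
trace_A = sum(A[i][i] for i in range(n))   # sum of the diagonal entries
det_T = 1
for lam in eigenvalues:
    det_T *= lam                           # product of the eigenvalues

print(char_poly[1] == -trace_A)            # True: coefficient of z^(n-1) is -trace T
print(char_poly[-1] == (-1) ** n * det_T)  # True: constant term is (-1)^n det T
```

The eigenvalues here have small integer real and imaginary parts, so the complex floating-point arithmetic is exact and the comparisons use `==` directly.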


We turn now to some simple but important properties of determinants. 
Later we will discover how to calculate det T from the matrix of T (with 
respect to an arbitrary basis). 

The crucial result below has an easy proof due to our definition. 


10.24 Invertible is equivalent to nonzero determinant 


An operator on V is invertible if and only if its determinant is nonzero. 


Proof First suppose V is a complex vector space and T ∈ L(V). The
operator T is invertible if and only if 0 is not an eigenvalue of T. Clearly this
happens if and only if the product of the eigenvalues of T is not 0. Thus T is
invertible if and only if det T ≠ 0, as desired.

Now consider the case where V is a real vector space and T ∈ L(V).
Again, T is invertible if and only if 0 is not an eigenvalue of T, which happens
if and only if 0 is not an eigenvalue of T_C (because T_C and T have the same
real eigenvalues by 9.11). Thus again we see that T is invertible if and only if
det T ≠ 0. ∎


Some textbooks take the result below as the definition of the characteristic
polynomial and then have our definition of the characteristic polynomial as a
consequence.


10.25 Characteristic polynomial of T equals det(zI − T)


Suppose T ∈ L(V). Then the characteristic polynomial of T equals
det(zI − T).


Proof First suppose V is a complex vector space. If λ, z ∈ C, then λ is an
eigenvalue of T if and only if z − λ is an eigenvalue of zI − T, as can be seen
from the equation


−(T − λI) = (zI − T) − (z − λ)I.


Raising both sides of this equation to the dim V power and then taking null
spaces of both sides shows that the multiplicity of λ as an eigenvalue of T
equals the multiplicity of z − λ as an eigenvalue of zI − T.

Let λ₁, …, λₙ denote the eigenvalues of T, repeated according to multiplicity.
Thus for z ∈ C, the paragraph above shows that the eigenvalues
of zI − T are z − λ₁, …, z − λₙ, repeated according to multiplicity. The
determinant of zI − T is the product of these eigenvalues. In other words,


det(zI − T) = (z − λ₁) ··· (z − λₙ).


The right side of the equation above is, by definition, the characteristic
polynomial of T, completing the proof when V is a complex vector space.

Now suppose V is a real vector space. Applying the complex case to T_C
gives the desired result. ∎


Determinant of a Matrix 


Our next task is to discover how to compute det T from the matrix of T (with
respect to an arbitrary basis). Let’s start with the easiest situation. Suppose
V is a complex vector space, T ∈ L(V), and we choose a basis of V as in
8.29. With respect to that basis, T has an upper-triangular matrix with the
diagonal of the matrix containing precisely the eigenvalues of T, each repeated
according to its multiplicity. Thus det T equals the product of the diagonal
entries of M(T) with respect to that basis.

When dealing with the trace in the previous section, we discovered that the 
formula (trace = sum of diagonal entries) that worked for the upper-triangular 
matrix given by 8.29 also worked with respect to an arbitrary basis. Could that 
also work for determinants? In other words, is the determinant of an operator 
equal to the product of the diagonal entries of the matrix of the operator with 
respect to an arbitrary basis?

Unfortunately, the determinant is more complicated than the trace. In particular,
det T need not equal the product of the diagonal entries of M(T) with
respect to an arbitrary basis. For example, the operator in Example 10.21 has
determinant 13 but the product of the diagonal entries of its matrix equals 0.

For each square matrix A, we want to define the determinant of A, denoted
det A, so that det T = det M(T) regardless of which basis is used to compute
M(T). We begin our search for the correct definition of the determinant
of a matrix by calculating the determinants of some special operators.


10.26 Example Suppose a₁, …, aₙ ∈ F. Let

\[
A = \begin{pmatrix}
0 & & & & a_n \\
a_1 & 0 & & & \\
& a_2 & 0 & & \\
& & \ddots & \ddots & \\
& & & a_{n-1} & 0
\end{pmatrix};
\]

here all entries of the matrix are 0 except for the upper-right corner and
along the line just below the diagonal. Suppose v₁, …, vₙ is a basis of V and
T ∈ L(V) is such that M(T, (v₁, …, vₙ)) = A. Find the determinant of T.


Solution First assume aⱼ ≠ 0 for each j = 1, …, n − 1. Note that the list
v₁, Tv₁, T²v₁, …, T^{n−1}v₁ equals v₁, a₁v₂, a₁a₂v₃, …, a₁···a_{n−1}vₙ.
Thus v₁, Tv₁, …, T^{n−1}v₁ is linearly independent (because the a’s are
all nonzero). Hence if p is a monic polynomial with degree at most n − 1, then
p(T)v₁ ≠ 0. Thus the minimal polynomial of T cannot have degree less
than n.

[Computing the minimal polynomial is often an efficient method of finding
the characteristic polynomial, as is done in this example.]

As you should verify, Tⁿvⱼ = a₁···aₙvⱼ for each j. Thus we have
Tⁿ = a₁···aₙI. Hence zⁿ − a₁···aₙ is the minimal polynomial of T. Because
n = dim V and the characteristic polynomial is a polynomial multiple
of the minimal polynomial (9.26), this implies that zⁿ − a₁···aₙ is also the
characteristic polynomial of T.

Thus 10.22 implies that


det T = (−1)^{n−1} a₁···aₙ.


If some aⱼ equals 0, then Tvⱼ = 0 for some j, which implies that 0 is an
eigenvalue of T and hence det T = 0. In other words, the formula above also
holds if some aⱼ equals 0.

Thus in order to have det T = det M(T), we will have to make the determinant
of the matrix in Example 10.26 equal to (−1)^{n−1} a₁···aₙ. However,
we do not yet have enough evidence to make a reasonable guess about the
proper definition of the determinant of an arbitrary square matrix.
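
The step in Example 10.26 that is left to the reader, namely that the nth power of T is a₁···aₙ times the identity, can be checked by direct matrix multiplication. This is a minimal sketch in plain Python for one hypothetical choice of the a's (2, 3, 5, 7, chosen only for illustration); the function names are not from the text.

```python
def cyclic_matrix(a):
    """Matrix of Example 10.26: a[0..n-2] just below the diagonal,
    a[n-1] in the upper-right corner, zeros elsewhere."""
    n = len(a)
    A = [[0] * n for _ in range(n)]
    for k in range(n - 1):
        A[k + 1][k] = a[k]      # column k holds the coordinates of T v_k = a_k v_{k+1}
    A[0][n - 1] = a[n - 1]      # last column: T v_n = a_n v_1
    return A

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matrix_power(A, p):
    """A multiplied by itself p times (p >= 0)."""
    n = len(A)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(p):
        result = matmul(result, A)
    return result

a = [2, 3, 5, 7]                 # illustrative values for a_1, ..., a_n
n = len(a)
A = cyclic_matrix(a)

product = 1
for value in a:
    product *= value             # a_1 ... a_n = 210

scaled_identity = [[product if i == j else 0 for j in range(n)] for i in range(n)]
print(matrix_power(A, n) == scaled_identity)   # True: A^n = (a_1 ... a_n) I
print((-1) ** (n - 1) * product)               # -210, the determinant predicted by the example
```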

To compute the determinants of a more complicated class of operators, we 
introduce the notion of permutation. 


10.27 Definition permutation, perm n


• A permutation of (1, …, n) is a list (m₁, …, mₙ) that contains
each of the numbers 1, …, n exactly once.
• The set of all permutations of (1, …, n) is denoted perm n.


For example, (2, 3, 4, 5, 1) ∈ perm 5. You should think of an element of
perm n as a rearrangement of the first n integers.


10.28 Example Suppose a₁, …, aₙ ∈ F and v₁, …, vₙ is a basis of V.
Consider a permutation (p₁, …, pₙ) ∈ perm n that can be obtained as follows:
break (1, …, n) into lists of consecutive integers and in each list move
the first term to the end of that list. For example, taking n = 9, the permutation


(2, 3, 1, 5, 6, 7, 4, 9, 8)


is obtained from (1, 2, 3), (4, 5, 6, 7), (8, 9) by moving the first term of each of
these lists to the end, producing (2, 3, 1), (5, 6, 7, 4), (9, 8), and then putting
these together to form the permutation displayed above.

Let T ∈ L(V) be the operator such that

\[ Tv_k = a_k v_{p_k} \]

for k = 1, …, n. Find det T.


Solution This generalizes Example 10.26, because if (p₁, …, pₙ) is the
permutation (2, 3, …, n, 1), then our operator T is the same as the operator
T in Example 10.26.

With respect to the basis v₁, …, vₙ, the matrix of the operator T is a block
diagonal matrix

\[
A = \begin{pmatrix}
A_1 & & 0 \\
& \ddots & \\
0 & & A_M
\end{pmatrix},
\]

where each block is a square matrix of the form of the matrix in 10.26.

Correspondingly, we can write V = V₁ ⊕ ··· ⊕ V_M, where each Vⱼ is
invariant under T and each T|_{Vⱼ} is of the form of the operator in 10.26.
Because det T = (det T|_{V₁}) ··· (det T|_{V_M}) (because the dimensions of the
generalized eigenspaces in the Vⱼ add up to dim V), we have

\[ \det T = (-1)^{n_1 - 1} \cdots (-1)^{n_M - 1} a_1 \cdots a_n, \]

where Vⱼ has dimension nⱼ (and correspondingly each Aⱼ has size nⱼ-by-nⱼ)
and we have used the result from 10.26.


The number (−1)^{n₁−1} ··· (−1)^{n_M−1} that appears above is called the sign
of the corresponding permutation (p₁, …, pₙ), denoted sign(p₁, …, pₙ)
[this is a temporary definition that we will change to an equivalent definition
later, when we define the sign of an arbitrary permutation].

To put this into a form that does not depend on the particular permutation
(p₁, …, pₙ), let A_{j,k} denote the entry in row j, column k, of the matrix A
from Example 10.28. Thus

\[
A_{j,k} = \begin{cases} 0 & \text{if } j \ne p_k, \\ a_k & \text{if } j = p_k. \end{cases}
\]

Example 10.28 shows that we want

10.29
\[ \det A = \sum_{(m_1,\dots,m_n) \in \operatorname{perm} n} (\operatorname{sign}(m_1,\dots,m_n))\, A_{m_1,1} \cdots A_{m_n,n}; \]

note that each summand is 0 except the one corresponding to the permutation
(p₁, …, pₙ) [which is why it does not matter that the sign of the other
permutations is not yet defined].

We can now guess that det A should be defined by 10.29 for an arbitrary 
square matrix A. This will turn out to be correct. We will now dispense with 
the motivation and begin the more formal approach. First we will need to 
define the sign of an arbitrary permutation. 


10.30 Definition sign of a permutation 


• The sign of a permutation (m₁, …, mₙ) is defined to be 1 if the
number of pairs of integers (j, k) with 1 ≤ j < k ≤ n such that
j appears after k in the list (m₁, …, mₙ) is even and −1 if the
number of such pairs is odd.


• In other words, the sign of a permutation equals 1 if the natural
order has been changed an even number of times and equals −1 if
the natural order has been changed an odd number of times.


10.31 Example sign of permutation


• The only pair of integers (j, k) with j < k such that j appears after
k in the list (2, 1, 3, 4) is (1, 2). Thus the permutation (2, 1, 3, 4) has
sign −1.


• In the permutation (2, 3, …, n, 1), the only pairs (j, k) with j < k
that appear with changed order are (1, 2), (1, 3), …, (1, n); because we
have n − 1 such pairs, the sign of this permutation equals (−1)^{n−1} (note
that the same quantity appeared in Example 10.26).
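
Definition 10.30 translates directly into a short computation: count the pairs that appear out of their natural order and look at the parity of the count. The sketch below is plain Python; the function name `sign` mirrors the notation of the text, while the representation of a permutation as a tuple is an assumption made here. It reproduces both bullets of Example 10.31 and compares the result with the temporary block-based sign for the permutation of Example 10.28.

```python
def sign(perm):
    """Sign of a permutation, following Definition 10.30.

    Counts the pairs of entries that appear out of their natural order
    and returns 1 if that count is even, -1 if it is odd.
    """
    n = len(perm)
    out_of_order = sum(
        1
        for a in range(n)
        for b in range(a + 1, n)
        if perm[a] > perm[b]      # the larger number appears first
    )
    return 1 if out_of_order % 2 == 0 else -1

print(sign((2, 1, 3, 4)))                  # -1, as in the first bullet of 10.31

n = 6
cycle = tuple(range(2, n + 1)) + (1,)      # the permutation (2, 3, ..., n, 1)
print(sign(cycle) == (-1) ** (n - 1))      # True

# The permutation from Example 10.28, built from blocks of sizes 3, 4, and 2.
p = (2, 3, 1, 5, 6, 7, 4, 9, 8)
temporary_sign = 1
for size in [3, 4, 2]:
    temporary_sign *= (-1) ** (size - 1)   # one factor (-1)^(n_j - 1) per block
print(sign(p) == temporary_sign)           # True: both descriptions agree here
```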


The next result shows that interchanging two entries of a permutation 
changes the sign of the permutation. 


10.32 Interchanging two entries in a permutation 


Interchanging two entries in a permutation multiplies the sign of the
permutation by −1.


Proof Suppose we have two permutations, where the second permutation is
obtained from the first by interchanging two entries. If the two interchanged
entries were in their natural order in the first permutation, then they no longer
are in the second permutation, and vice versa, for a net change (so far) of 1 or
−1 (both odd numbers) in the number of pairs not in their natural order.

Consider each entry between the two interchanged entries. If an intermediate
entry was originally in the natural order with respect to both interchanged
entries, then it is now in the natural order with respect to neither
interchanged entry. Similarly, if an intermediate entry
was originally in the natural order with respect to neither of the interchanged
entries, then it is now in the natural order with respect to both interchanged
entries. If an intermediate entry was originally in the natural order with respect
to exactly one of the interchanged entries, then that is still true. Thus the net
change for each intermediate entry in the number of pairs not in their natural
order is 2, −2, or 0 (all even numbers).

For all the other entries, there is no change in the number of pairs not in
their natural order. Thus the total net change in the number of pairs not in
their natural order is an odd number. Thus the sign of the second permutation
equals −1 times the sign of the first permutation. ∎

[Some texts use the term signum, which means the same as sign.]
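
For small n, result 10.32 can also be confirmed by brute force. The following sketch, in plain Python, runs through every permutation of (1, …, n) and every pair of positions and checks that the interchange flips the sign; the helper name `swap_always_flips_sign` is illustrative, and an exhaustive check of this kind is of course no substitute for the proof.

```python
from itertools import combinations, permutations

def sign(perm):
    """Sign of a permutation via the count of out-of-order pairs (Definition 10.30)."""
    n = len(perm)
    out_of_order = sum(1 for a in range(n) for b in range(a + 1, n)
                       if perm[a] > perm[b])
    return 1 if out_of_order % 2 == 0 else -1

def swap_always_flips_sign(n):
    """Check 10.32 for every permutation of (1, ..., n) and every pair of positions."""
    for perm in permutations(range(1, n + 1)):
        for a, b in combinations(range(n), 2):
            swapped = list(perm)
            swapped[a], swapped[b] = swapped[b], swapped[a]
            if sign(swapped) != -sign(perm):
                return False
    return True

print(all(swap_always_flips_sign(n) for n in range(2, 6)))   # True
```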


Our motivation for the next definition comes from 10.29.


10.33 Definition determinant of a matrix, det A


Suppose A is an n-by-n matrix

\[
A = \begin{pmatrix}
A_{1,1} & \dots & A_{1,n} \\
\vdots & & \vdots \\
A_{n,1} & \dots & A_{n,n}
\end{pmatrix}.
\]

The determinant of A, denoted det A, is defined by

\[ \det A = \sum_{(m_1,\dots,m_n) \in \operatorname{perm} n} (\operatorname{sign}(m_1,\dots,m_n))\, A_{m_1,1} \cdots A_{m_n,n}. \]


10.34 Example determinants 


• If A is the 1-by-1 matrix [A_{1,1}], then det A = A_{1,1}, because perm 1
has only one element, namely (1), which has sign 1.


• Clearly perm 2 has only two elements, namely (1, 2), which has sign 1,
and (2, 1), which has sign −1. Thus

\[
\det \begin{pmatrix} A_{1,1} & A_{1,2} \\ A_{2,1} & A_{2,2} \end{pmatrix}
= A_{1,1}A_{2,2} - A_{2,1}A_{1,2}.
\]


To make sure you understand this process, you should now find the formula
for the determinant of an arbitrary 3-by-3 matrix using just the definition
given above.

[The set perm 3 contains six elements. In general, perm n contains
n! elements. Note that n! rapidly grows large as n increases.]
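
Definition 10.33 can be evaluated literally for small matrices. The sketch below is plain Python, assuming a matrix is stored as a list of rows and permutations are taken 0-based, which does not affect their signs; the names `sign` and `det` mirror the text's notation. Because the sum has n! terms, this is only practical for small n.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation via the count of out-of-order pairs (Definition 10.30)."""
    n = len(perm)
    out_of_order = sum(1 for a in range(n) for b in range(a + 1, n)
                       if perm[a] > perm[b])
    return 1 if out_of_order % 2 == 0 else -1

def det(A):
    """Determinant of a square matrix as the sum over permutations in 10.33."""
    n = len(A)
    total = 0
    for perm in permutations(range(n)):      # 0-based version of perm n
        term = sign(perm)
        for col in range(n):
            term *= A[perm[col]][col]        # entry in row m_col, column col
        total += term
    return total

print(det([[7]]))                            # 7
print(det([[1, 2], [3, 4]]))                 # 1*4 - 3*2 = -2
print(det([[2, 0, 1],
           [1, 3, 0],
           [0, 1, 4]]))                      # 25
print(len(list(permutations(range(3)))))     # 6 terms in the 3-by-3 sum
```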


10.35 Example Compute the determinant of an upper-triangular matrix

\[
A = \begin{pmatrix}
A_{1,1} & & * \\
& \ddots & \\
0 & & A_{n,n}
\end{pmatrix}.
\]

Solution The permutation (1, 2, …, n) has sign 1 and thus contributes a term
of A_{1,1} ··· A_{n,n} to the sum defining det A in 10.33. Any other permutation
(m₁, …, mₙ) ∈ perm n contains at least one entry mⱼ with mⱼ > j, which
means that A_{mⱼ,j} = 0 (because A is upper triangular). Thus all the other
terms in the sum in 10.33 make no contribution.

Hence det A = A_{1,1} ··· A_{n,n}. In other words, the determinant of an
upper-triangular matrix equals the product of the diagonal entries.

Suppose V is a complex vector space, T ∈ L(V), and we choose a basis
of V as in 8.29. With respect to that basis, T has an upper-triangular matrix
with the diagonal of the matrix containing precisely the eigenvalues of T,
each repeated according to its multiplicity. Thus Example 10.35 tells us that
det T = det M(T), where the matrix is with respect to that basis.

Our goal is to prove that det T = det M(T) for every basis of V, not just 
the basis from 8.29. To do this, we will need to develop some properties of 
determinants of matrices. The result below is the first of the properties we 
will need. 


10.36 Interchanging two columns in a matrix 


Suppose A is a square matrix and B is the matrix obtained from A by
interchanging two columns. Then


det A = −det B.


Proof Think of the sum defining det A in 10.33 and the corresponding sum
defining det B. The same products of A_{j,k}’s appear in both sums, although
they correspond to different permutations. The permutation corresponding to
a given product of A_{j,k}’s when computing det B is obtained by interchanging
two entries in the corresponding permutation when computing det A, thus
multiplying the sign of the permutation by −1 (see 10.32). Hence we see that
det A = −det B. ∎
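
Result 10.36 is easy to observe on a concrete matrix. The sketch below is plain Python and evaluates the determinant straight from Definition 10.33; the matrix entries and the helper name `swap_columns` are chosen here purely for illustration.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation via the count of out-of-order pairs (Definition 10.30)."""
    n = len(perm)
    out_of_order = sum(1 for a in range(n) for b in range(a + 1, n)
                       if perm[a] > perm[b])
    return 1 if out_of_order % 2 == 0 else -1

def det(A):
    """Determinant of a square matrix as the sum over permutations in 10.33."""
    n = len(A)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for col in range(n):
            term *= A[perm[col]][col]
        total += term
    return total

def swap_columns(A, j, k):
    """Copy of A with columns j and k interchanged (0-based indices)."""
    B = [row[:] for row in A]
    for row in B:
        row[j], row[k] = row[k], row[j]
    return B

A = [[2, 0, 1],
     [1, 3, 0],
     [0, 1, 4]]
B = swap_columns(A, 0, 2)

print(det(A), det(B))        # 25 -25
print(det(A) == -det(B))     # True
```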


If T ∈ L(V) and the matrix of T (with respect to some basis) has two
equal columns, then T is not injective and hence det T = 0. Although this 
comment makes the next result plausible, it cannot be used in the proof, 
because we do not yet know that det T = det M(T) for every choice of basis. 


10.37 Matrices with two equal columns 


If A is a square matrix that has two equal columns, then det A = 0. 


Proof Suppose A is a square matrix that has two equal columns. Interchanging
the two equal columns of A gives the original matrix A. Thus from 10.36
(with B = A), we have

det A = −det A,


which implies that det A = 0. ∎

Recall from 3.44 that if A is an n-by-n matrix

\[
A = \begin{pmatrix}
A_{1,1} & \dots & A_{1,n} \\
\vdots & & \vdots \\
A_{n,1} & \dots & A_{n,n}
\end{pmatrix},
\]

then we can think of the kth column of A as an n-by-1 matrix denoted A_{·,k}:

\[
A_{\cdot,k} = \begin{pmatrix} A_{1,k} \\ \vdots \\ A_{n,k} \end{pmatrix}.
\]

Note that A_{j,k}, with two subscripts, denotes an entry of A, whereas A_{·,k}, with
a dot as a placeholder and one subscript, denotes a column of A. This notation
allows us to write A in the form

\[ \begin{pmatrix} A_{\cdot,1} & \dots & A_{\cdot,n} \end{pmatrix}, \]

which will be useful.

[Some books define the determinant to be the function defined on the
square matrices that is linear as a function of each column separately and
that satisfies 10.38 and det I = 1. To prove that such a function exists and
that it is unique takes a nontrivial amount of work.]

The next result shows that a permutation of the columns of a matrix 
changes the determinant by a factor of the sign of the permutation. 


10.38 Permuting the columns of a matrix 


Suppose A = ( A_{·,1} … A_{·,n} ) is an n-by-n matrix and (m₁, …, mₙ)
is a permutation. Then

\[ \det \begin{pmatrix} A_{\cdot,m_1} & \dots & A_{\cdot,m_n} \end{pmatrix} = (\operatorname{sign}(m_1,\dots,m_n)) \det A. \]


Proof We can transform the matrix ( A_{·,m₁} … A_{·,mₙ} ) into A through a
series of steps. In each step, we interchange two columns and hence multiply
the determinant by −1 (see 10.36). The number of steps needed equals the
number of steps needed to transform the permutation (m₁, …, mₙ) into the
permutation (1, …, n) by interchanging two entries in each step. The proof
is completed by noting that the number of such steps is even if (m₁, …, mₙ)
has sign 1, odd if (m₁, …, mₙ) has sign −1 (this follows from 10.32, along
with the observation that the permutation (1, …, n) has sign 1). ∎

The next result about determinants will also be useful.


10.39 Determinant is a linear function of each column 


Suppose k, n are positive integers with 1 ≤ k ≤ n. Fix n-by-1 matrices
A_{·,1}, …, A_{·,n} except A_{·,k}. Then the function that takes an n-by-1 column
vector A_{·,k} to

\[ \det \begin{pmatrix} A_{\cdot,1} & \dots & A_{\cdot,k} & \dots & A_{\cdot,n} \end{pmatrix} \]

is a linear map from the vector space of n-by-1 matrices with entries in F
to F.


Proof The linearity follows easily from 10.33, where each term in the sum
contains precisely one entry from the kth column of A. ∎


Now we are ready to prove one of the key properties about determinants
of square matrices. This property will enable us to connect the determinant of
an operator with the determinant of its matrix. Note that this proof is
considerably more complicated than the proof of the corresponding result
about the trace (see 10.14).

[The result below was first proved in 1812 by French mathematicians
Jacques Binet and Augustin-Louis Cauchy.]


10.40 Determinant is multiplicative 


Suppose A and B are square matrices of the same size. Then 


det(AB) = det(BA) = (det A)(det B). 


Proof Write A = ( A_{·,1} … A_{·,n} ), where each A_{·,k} is an n-by-1 column
of A. Also write

\[
B = \begin{pmatrix}
B_{1,1} & \dots & B_{1,n} \\
\vdots & & \vdots \\
B_{n,1} & \dots & B_{n,n}
\end{pmatrix}
= \begin{pmatrix} B_{\cdot,1} & \dots & B_{\cdot,n} \end{pmatrix},
\]

where each B_{·,k} is an n-by-1 column of B. Let eₖ denote the n-by-1 matrix
that equals 1 in the kth row and 0 elsewhere. Note that Aeₖ = A_{·,k} and
Beₖ = B_{·,k}. Furthermore,

\[ B_{\cdot,k} = \sum_{m=1}^{n} B_{m,k} e_m. \]

First we will prove det(AB) = (det A)(det B). As we observed earlier
(see 3.49), the definition of matrix multiplication easily implies that
AB = ( AB_{·,1} … AB_{·,n} ). Thus

\[
\begin{aligned}
\det(AB) &= \det \begin{pmatrix} AB_{\cdot,1} & \dots & AB_{\cdot,n} \end{pmatrix} \\
&= \det \begin{pmatrix} A\bigl(\sum_{m_1=1}^{n} B_{m_1,1} e_{m_1}\bigr) & \dots & A\bigl(\sum_{m_n=1}^{n} B_{m_n,n} e_{m_n}\bigr) \end{pmatrix} \\
&= \det \begin{pmatrix} \sum_{m_1=1}^{n} B_{m_1,1} A e_{m_1} & \dots & \sum_{m_n=1}^{n} B_{m_n,n} A e_{m_n} \end{pmatrix} \\
&= \sum_{m_1=1}^{n} \cdots \sum_{m_n=1}^{n} B_{m_1,1} \cdots B_{m_n,n} \det \begin{pmatrix} A e_{m_1} & \dots & A e_{m_n} \end{pmatrix},
\end{aligned}
\]

where the last equality comes from repeated applications of the linearity of det
as a function of one column at a time (10.39). In the last sum above, all terms
in which mⱼ = mₖ for some j ≠ k can be ignored, because the determinant
of a matrix with two equal columns is 0 (by 10.37). Thus instead of summing
over all m₁, …, mₙ with each mⱼ taking on values 1, …, n, we can sum just
over the permutations, where the mⱼ’s have distinct values. In other words,

\[
\begin{aligned}
\det(AB) &= \sum_{(m_1,\dots,m_n) \in \operatorname{perm} n} B_{m_1,1} \cdots B_{m_n,n} \det \begin{pmatrix} A e_{m_1} & \dots & A e_{m_n} \end{pmatrix} \\
&= \sum_{(m_1,\dots,m_n) \in \operatorname{perm} n} B_{m_1,1} \cdots B_{m_n,n} (\operatorname{sign}(m_1,\dots,m_n)) \det A \\
&= (\det A) \sum_{(m_1,\dots,m_n) \in \operatorname{perm} n} (\operatorname{sign}(m_1,\dots,m_n))\, B_{m_1,1} \cdots B_{m_n,n} \\
&= (\det A)(\det B),
\end{aligned}
\]

where the second equality comes from 10.38.

In the paragraph above, we proved that det(AB) = (det A)(det B).
Interchanging the roles of A and B, we have det(BA) = (det B)(det A). The
last equation can be rewritten as det(BA) = (det A)(det B), completing the
proof. ∎
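
The multiplicative property can be seen at work on a pair of matrices that do not commute. The sketch below is plain Python with the determinant evaluated from Definition 10.33; the two integer matrices are arbitrary choices made here for illustration.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation via the count of out-of-order pairs (Definition 10.30)."""
    n = len(perm)
    out_of_order = sum(1 for a in range(n) for b in range(a + 1, n)
                       if perm[a] > perm[b])
    return 1 if out_of_order % 2 == 0 else -1

def det(A):
    """Determinant of a square matrix as the sum over permutations in 10.33."""
    n = len(A)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for col in range(n):
            term *= A[perm[col]][col]
        total += term
    return total

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[1, 2, 0],
     [0, 1, 3],
     [2, 0, 1]]
B = [[2, 0, 1],
     [1, 3, 0],
     [0, 1, 4]]

AB, BA = matmul(A, B), matmul(B, A)
print(AB != BA)                                   # True: the products differ
print(det(AB) == det(BA) == det(A) * det(B))      # True: their determinants agree
```

Even though AB and BA are different matrices, both determinants equal the product of det A and det B, as 10.40 asserts.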
Now we can prove that the determinant of the matrix of an operator is
independent of the basis with respect to which the matrix is computed.

[Note the similarity of the proof of the next result to the proof of the
analogous result about the trace (see 10.15).]


10.41 Determinant of matrix of operator does not depend on basis 


Let T ∈ L(V). Suppose u₁, …, uₙ and v₁, …, vₙ are bases of V. Then

\[ \det \mathcal{M}(T,(u_1,\dots,u_n)) = \det \mathcal{M}(T,(v_1,\dots,v_n)). \]

Proof Let A = M(I, (u₁, …, uₙ), (v₁, …, vₙ)). Then

\[
\begin{aligned}
\det \mathcal{M}(T,(u_1,\dots,u_n)) &= \det\bigl(A^{-1}(\mathcal{M}(T,(v_1,\dots,v_n))A)\bigr) \\
&= \det\bigl((\mathcal{M}(T,(v_1,\dots,v_n))A)A^{-1}\bigr) \\
&= \det \mathcal{M}(T,(v_1,\dots,v_n)),
\end{aligned}
\]

where the first equality follows from 10.7 and the second equality follows
from 10.40. The third equality completes the proof. ∎
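
The basis independence in 10.41 amounts to saying that A⁻¹MA and M have the same determinant for every invertible A. The following plain-Python sketch checks this for one hand-picked case. The matrix `P` plays the role of the change-of-basis matrix A, and it was chosen (as an assumption of this sketch) to have integer entries and an integer inverse `P_inv` so that all arithmetic is exact; `M` is the matrix from Example 10.21.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation via the count of out-of-order pairs (Definition 10.30)."""
    n = len(perm)
    out_of_order = sum(1 for a in range(n) for b in range(a + 1, n)
                       if perm[a] > perm[b])
    return 1 if out_of_order % 2 == 0 else -1

def det(A):
    """Determinant of a square matrix as the sum over permutations in 10.33."""
    n = len(A)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for col in range(n):
            term *= A[perm[col]][col]
        total += term
    return total

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

M = [[3, -1, -2],
     [3,  2, -3],
     [1,  2,  0]]
P = [[1, 1, 0],
     [0, 1, 1],
     [0, 0, 1]]
P_inv = [[1, -1,  1],
         [0,  1, -1],
         [0,  0,  1]]
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

print(matmul(P, P_inv) == identity)       # True: P_inv really is the inverse of P
similar = matmul(matmul(P_inv, M), P)     # matrix of the same operator in another basis
print(det(similar) == det(M))             # True
print(sum(similar[i][i] for i in range(3)) == sum(M[i][i] for i in range(3)))  # True: trace agrees too
```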


The result below states that the determinant of an operator equals the 
determinant of the matrix of the operator. This theorem does not specify a 
basis because, by the result above, the determinant of the matrix of an operator 
is the same for every choice of basis. 


10.42 Determinant of an operator equals determinant of its matrix 
Suppose T ∈ L(V). Then det T = det M(T).


Proof As noted above, 10.41 implies that det M(T) is independent of which
basis of V we choose. Thus to show that det T = det M(T) for every basis
of V, we need only show that the result holds for some basis of V.

As we have already discussed, if V is a complex vector space, then choosing
a basis of V as in 8.29 gives the desired result. If V is a real vector space,
then applying the complex case to the complexification T_C (which is used to
define det T) gives the desired result. ∎


If we know the matrix of an operator on a complex vector space, the result 
above allows us to find the product of all the eigenvalues without finding any 
of the eigenvalues. 


10.43 Example Suppose T is the operator on C⁵ whose matrix is

\[
\begin{pmatrix}
0 & 0 & 0 & 0 & -3 \\
1 & 0 & 0 & 0 & 6 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0
\end{pmatrix}.
\]

No one knows an exact formula for any of the eigenvalues of this operator.
However, we do know that the product of the eigenvalues equals −3, because
the determinant of the matrix above equals −3.

We can use 10.42 to give easy proofs of some useful properties about
determinants of operators by shifting to the language of determinants of
matrices, where certain properties have already been proved or are obvious.
We carry out this procedure in the next result.


10.44 Determinant is multiplicative 
Suppose S, T ∈ L(V). Then


det(ST) = det(TS) = (det S)(det T).


Proof Choose a basis of V. Then


det(ST) = det M(ST)
        = det(M(S)M(T))
        = (det M(S))(det M(T))
        = (det S)(det T),


where the first and last equalities come from 10.42 and the third equality
comes from 10.40.

In the paragraph above, we proved that det(ST) = (det S)(det T).
Interchanging the roles of S and T, we have det(TS) = (det T)(det S). Because
multiplication of elements of F is commutative, the last equation can be
rewritten as det(TS) = (det S)(det T), completing the proof. ∎


The Sign of the Determinant 


We proved the basic results of linear algebra before introducing determinants 
in this final chapter. Although determinants have value as a research tool in 
more advanced subjects, they play little role in basic linear algebra (when the 
subject is done right). 


Determinants do have one important application in undergraduate mathematics,
namely, in computing certain volumes and integrals. In this subsection
we interpret the meaning of the sign of the determinant on a real vector space.
Then in the final subsection we will use the linear algebra we have learned to
make clear the connection between determinants and these applications. Thus
we will be dealing with a part of analysis that uses linear algebra.

[Most applied mathematicians agree that determinants should rarely be used
in serious numeric calculations.]

We will begin with some purely linear algebra results that will also be
useful when investigating volumes. Our setting will be inner product spaces.
Recall that an isometry on an inner product space is an operator that preserves
norms. The next result shows that every isometry has determinant with
absolute value 1.


10.45 Isometries have determinant with absolute value 1 


Suppose V is an inner product space and S € L(V) is an isometry. Then 
|det S| = 1. 


Proof First consider the case where V is a complex inner product space. 
Then all the eigenvalues of S have absolute value 1 (see the proof of 7.43). 
Thus the product of the eigenvalues of S, counting multiplicity, has absolute 
value one. In other words, |det S| = 1, as desired. 

Now suppose V is a real inner product space. We present two different 
proofs in this case. 

Proof 1: With respect to the inner product on the complexification Vc given 
by Exercise 3 in Section 9.B, it is easy to see that Sc is an isometry on Vc. 
Thus by the complex case that we have already done, we have |det Sc| = 1. 
By definition of the determinant on real vector spaces, we have det S = det Sc 
and thus |det S| = 1, completing the proof. 

Proof 2: By 9.36, there is an orthonormal basis of V with respect to which
M(S) is a block diagonal matrix, where each block on the diagonal is a
1-by-1 matrix containing 1 or −1 or a 2-by-2 matrix of the form

\[ \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}, \]

with θ ∈ (0, π). Note that the determinant of each 2-by-2 matrix of the form
above equals 1 (because cos²θ + sin²θ = 1). Thus the determinant of S,
which is the product of the determinants of the blocks (see Exercise 6), is the
product of 1's and −1's. Hence |det S| = 1, as desired. ∎
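
As a numerical sanity check of Proof 2 (not part of the argument), the sketch
below assembles a block diagonal matrix of the kind described in 9.36 and
computes its determinant. The helper names `det` and `block_diag`, the angle,
and the particular choice of blocks are illustrative assumptions, not notation
from this book.

```python
import math

def det(m):
    """Determinant by cofactor expansion along the first row (fine for small matrices)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0.0
    for k in range(n):
        minor = [row[:k] + row[k + 1:] for row in m[1:]]
        total += (-1) ** k * m[0][k] * det(minor)
    return total

def block_diag(blocks):
    """Assemble square blocks into one block diagonal matrix."""
    size = sum(len(b) for b in blocks)
    out = [[0.0] * size for _ in range(size)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, entry in enumerate(row):
                out[offset + i][offset + j] = entry
        offset += len(b)
    return out

theta = 0.7  # any angle in (0, pi); chosen arbitrarily
rotation = [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]

# One 1-by-1 block equal to 1, one equal to -1, and one rotation block.
S = block_diag([[[1.0]], [[-1.0]], rotation])
d = det(S)
print(abs(d))  # approximately 1
```

Here the determinant itself is approximately −1, because exactly one 1-by-1
block equals −1.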


The Real Spectral Theorem 7.29 states that a self-adjoint operator T on a 
real inner product space has an orthonormal basis consisting of eigenvectors. 
With respect to such a basis, the number of times each eigenvalue appears on 
the diagonal of M(T) is its multiplicity. Thus det T equals the product of its
eigenvalues, counting multiplicity (of course, this holds for every operator, 
self-adjoint or not, on a complex vector space). 
Recall that if V is an inner product space and T ∈ L(V), then T*T is a
positive operator and hence has a unique positive square root, denoted √(T*T)
(see 7.35 and 7.36). Because √(T*T) is positive, all its eigenvalues are
nonnegative (again, see 7.35), and hence det √(T*T) ≥ 0. These considerations
play a role in the next example.


10.46 Example Suppose V is a real inner product space and T ∈ L(V)
is invertible (and thus det T is either positive or negative). Attach a geometric
meaning to the sign of det T.


Solution First we consider an isometry S ∈ L(V). By 10.45, the determinant
of S equals 1 or −1. Note that

{v ∈ V : Sv = −v}

is the eigenspace E(−1, S). Thinking geometrically, we could say that this
is the subspace on which S reverses direction. An examination of proof 2 of
10.45 shows that det S = 1 if this subspace has even dimension and
det S = −1 if this subspace has odd dimension.

We are not formally defining the phrase "reverses direction" because these
comments are meant only as an intuitive aid to our understanding.

Returning to our arbitrary invertible operator T ∈ L(V), by the Polar
Decomposition (7.45) there is an isometry S ∈ L(V) such that

T = S√(T*T).

Now 10.44 tells us that

det T = (det S)(det √(T*T)).


The remarks just before this example pointed out that det √(T*T) ≥ 0, and
this determinant is nonzero here because det T ≠ 0. Thus
whether det T is positive or negative depends on whether det S is positive or
negative. As we saw in the paragraph above, this depends on whether the
subspace on which S reverses direction has even or odd dimension.

Because T is the product of S and an operator that never reverses direction
(namely, √(T*T)), we can reasonably say that whether det T is positive or
negative depends on whether T reverses vectors an even or an odd number of
times.
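
As an illustration of the solution above, consider the operator T on R²
defined by T(x, y) = (2x, −3y). This operator is a hypothetical example chosen
for this note, not one from the text, and the matrices below are taken with
respect to the standard basis.

```latex
\[
T = \begin{pmatrix} 2 & 0 \\ 0 & -3 \end{pmatrix}, \qquad
T^*T = \begin{pmatrix} 4 & 0 \\ 0 & 9 \end{pmatrix}, \qquad
\sqrt{T^*T} = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}, \qquad
S = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.
\]
\[
T = S\sqrt{T^*T}, \qquad
\dim E(-1, S) = 1 \text{ (odd)}, \qquad
\det T = (\det S)\bigl(\det\sqrt{T^*T}\bigr) = (-1)(6) = -6 < 0.
\]
```

The isometry S reverses direction on a 1-dimensional subspace, which matches
the negative sign of det T.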
Volume
The next result will be a key tool in our investigation of volume. Recall that 
our remarks before Example 10.46 pointed out that det √(T*T) ≥ 0.

10.47 |det T| = det √(T*T)

Suppose V is an inner product space and T ∈ L(V). Then


|det T| = det √(T*T).


Another proof of this result is suggested in Exercise 8.


Proof By the Polar Decomposition (7.45), there is an isometry S ∈ L(V)
such that

T = S√(T*T).

Thus

|det T| = |det S| det √(T*T)
        = det √(T*T),

where the first equality follows from 10.44 and the second equality follows
from 10.45. ∎
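
The identity in 10.47 can be checked numerically for a 2-by-2 real matrix.
The sketch below is only an illustration: the sample matrix is arbitrary, the
helper names are made up for this note, and the closed form used for the
square root of a 2-by-2 positive definite matrix (which can be verified with
the Cayley-Hamilton Theorem) is an assumption brought in from outside this
section.

```python
import math

def matmul(a, b):
    """Product of two 2-by-2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    """Transpose of a 2-by-2 matrix (the matrix of the adjoint for real entries)."""
    return [[a[j][i] for j in range(2)] for i in range(2)]

def det2(a):
    """Determinant of a 2-by-2 matrix."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

def sqrt_positive_2x2(p):
    """Positive square root of a 2-by-2 positive definite matrix p.

    Uses the closed form (p + s I) / t with s = sqrt(det p) and
    t = sqrt(trace p + 2 s).
    """
    s = math.sqrt(det2(p))
    t = math.sqrt(p[0][0] + p[1][1] + 2 * s)
    return [[(p[0][0] + s) / t, p[0][1] / t],
            [p[1][0] / t, (p[1][1] + s) / t]]

T = [[1.0, 2.0],
     [3.0, -1.0]]            # det T = -7
P = matmul(transpose(T), T)  # matrix of T*T with respect to the standard basis
R = sqrt_positive_2x2(P)     # matrix of the positive square root of T*T

print(abs(det2(T)), det2(R))  # both approximately 7
```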


Now we turn to the question of volume in R^n. Fix a positive integer n for
the rest of this subsection. We will consider only the real inner product space
R^n, with its standard inner product.

We would like to assign to each subset Ω of R^n its n-dimensional volume
(when n = 2, this is usually called area instead of volume). We begin with
boxes, where we have a good intuitive notion of volume.


10.48 Definition box
A box in R^n is a set of the form

{(y_1, ..., y_n) ∈ R^n : x_j < y_j < x_j + r_j for j = 1, ..., n},

where r_1, ..., r_n are positive numbers and (x_1, ..., x_n) ∈ R^n. The
numbers r_1, ..., r_n are called the side lengths of the box.


You should verify that when n = 2, a box is a rectangle with sides parallel 
to the coordinate axes, and that when n = 3, a box is a familiar 3-dimensional 
box with sides parallel to the coordinate axes. 
The next definition fits with our intuitive notion of volume, because we
define the volume of a box to be the product of the side lengths of the box. 


10.49 Definition volume of a box 


The volume of a box B in R^n with side lengths r_1, ..., r_n is defined to
be r_1 ⋯ r_n and is denoted by volume B.


To define the volume of an arbitrary set Ω ⊂ R^n, the idea is to write Ω as a
subset of a union of many small boxes, then add up the volumes of these small
boxes. As we approximate Ω more accurately by unions of small boxes, we
get a better estimate of volume Ω.

Readers familiar with outer measure will recognize that concept here.


10.50 Definition volume 


Suppose Ω ⊂ R^n. Then the volume of Ω, denoted volume Ω, is defined
to be the infimum of


volume B_1 + volume B_2 + ⋯,


where the infimum is taken over all sequences B_1, B_2, ... of boxes in R^n
whose union contains Ω.
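
The idea behind this definition can be sketched numerically for the open unit
disk in R². The code below is a rough illustration under stated assumptions:
it uses only grid-aligned boxes of one fixed side length (the definition allows
arbitrary sequences of boxes), it treats the grid cells as closed squares
(open boxes would miss the grid lines, which can be fixed by enlarging each
box very slightly), and the function name `covering_area` is made up for this
note.

```python
import math

def covering_area(cells_per_unit):
    """Total area of the grid squares of side h = 1/cells_per_unit that meet the unit disk."""
    h = 1.0 / cells_per_unit
    count = 0
    for i in range(-cells_per_unit, cells_per_unit):
        for j in range(-cells_per_unit, cells_per_unit):
            # Point of the closed cell [i h, (i+1) h] x [j h, (j+1) h] nearest the origin.
            nx = min(max(0.0, i * h), (i + 1) * h)
            ny = min(max(0.0, j * h), (j + 1) * h)
            if nx * nx + ny * ny < 1.0:
                count += 1
    return count * h * h

for m in (4, 16, 64, 256):
    print(m, covering_area(m))  # estimates decrease toward pi
```

Each estimate is an upper bound for the area of the disk, and refining the
grid can only lower it, which mirrors taking an infimum over coverings.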


We will work only with an intuitive notion of volume. Our purpose in this 
book is to understand linear algebra, whereas notions of volume belong to 
analysis (although volume is intimately connected with determinants, as we 
will soon see). Thus for the rest of this section we will rely on intuitive notions 
of volume rather than on a rigorous development, although we shall maintain 
our usual rigor in the linear algebra parts of what follows. Everything said 
here about volume will be correct if appropriately interpreted—the intuitive 
approach used here can be converted into appropriate correct definitions, 
correct statements, and correct proofs using the machinery of analysis. 


10.51 Notation T(Ω)
For T a function defined on a set Ω, define T(Ω) by
T(Ω) = {Tx : x ∈ Ω}.


For T ∈ L(R^n) and Ω ⊂ R^n, we seek a formula for volume T(Ω) in
terms of T and volume Ω. We begin by looking at positive operators.
10.52 Positive operators change volume by factor of determinant


Suppose T ∈ L(R^n) is a positive operator and Ω ⊂ R^n. Then


volume T(Ω) = (det T)(volume Ω).


Proof To get a feeling for why this result is true, first consider the special
case where λ_1, ..., λ_n are positive numbers and T ∈ L(R^n) is defined by


T(x_1, ..., x_n) = (λ_1 x_1, ..., λ_n x_n).


This operator stretches the j-th standard basis vector by a factor of λ_j. If B
is a box in R^n with side lengths r_1, ..., r_n, then T(B) is a box in R^n with
side lengths λ_1 r_1, ..., λ_n r_n. The box T(B) thus has volume
λ_1 ⋯ λ_n r_1 ⋯ r_n, whereas the box B has volume r_1 ⋯ r_n. Note that
det T = λ_1 ⋯ λ_n. Thus


volume T(B) = (det T)(volume B)


for every box B in R^n. Because the volume of Ω is approximated by sums of
volumes of boxes, this implies that volume T(Ω) = (det T)(volume Ω).
Now consider an arbitrary positive operator T ∈ L(R^n). By the Real
Spectral Theorem (7.29), there exist an orthonormal basis e_1, ..., e_n of R^n
and nonnegative numbers λ_1, ..., λ_n such that Te_j = λ_j e_j for j = 1, ..., n.
In the special case where e_1, ..., e_n is the standard basis of R^n, this operator
is the same one as defined in the paragraph above. For an arbitrary orthonormal
basis e_1, ..., e_n, this operator has the same behavior as the one in the
paragraph above: it stretches the j-th basis vector in an orthonormal basis by
a factor of λ_j. Your intuition about volume should convince you that volume
behaves the same with respect to each orthonormal basis. That intuition, and
the special case of the paragraph above, should convince you that T multiplies
volume by a factor of λ_1 ⋯ λ_n, which again equals det T. ∎
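
For a concrete instance of the special case treated in the first paragraph of
the proof above, take n = 2 with stretch factors 2 and 3. The numbers and the
particular box are chosen only for illustration.

```latex
\[
T(x_1, x_2) = (2x_1, 3x_2), \qquad
B = \{(y_1, y_2) : 0 < y_1 < 1,\ 0 < y_2 < 4\}.
\]
\[
T(B) = \{(y_1, y_2) : 0 < y_1 < 2,\ 0 < y_2 < 12\}, \qquad
\operatorname{volume} T(B) = 2 \cdot 12 = 24 = 6 \cdot 4
= (\det T)(\operatorname{volume} B).
\]
```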


Our next tool is the following result, which states that isometries do not 
change volume. 


10.53 An isometry does not change volume 


Suppose S ∈ L(R^n) is an isometry and Ω ⊂ R^n. Then


volume S(Ω) = volume Ω.


Proof For x, y ∈ R^n, we have


‖Sx − Sy‖ = ‖S(x − y)‖
          = ‖x − y‖.

In other words, S does not change the distance between points. That property 
alone may be enough to convince you that S does not change volume. 

However, if you need stronger persuasion, consider the complete descrip- 
tion of isometries on real inner product spaces provided by 9.36. According to 
9.36, S can be decomposed into pieces, each of which is the identity on some 
subspace (which clearly does not change volume) or multiplication by —1 on 
some subspace (which again clearly does not change volume) or a rotation 
on a 2-dimensional subspace (which again does not change volume). Or use 
9.36 in conjunction with Exercise 7 in Section 9.B to write S as a product of 
operators, each of which does not change volume. Either way, you should be 
convinced that S does not change volume. ∎


Now we can prove that an operator T ∈ L(R^n) changes volume by a factor
of |det T|. Note the huge importance of the Polar Decomposition in the proof. 


10.54 T changes volume by factor of |det T|
Suppose T ∈ L(R^n) and Ω ⊂ R^n. Then


volume T(Ω) = |det T|(volume Ω).


Proof By the Polar Decomposition (7.45), there is an isometry S ∈ L(R^n)
such that

T = S√(T*T).

If Ω ⊂ R^n, then T(Ω) = S(√(T*T)(Ω)). Thus


volume T(Ω) = volume S(√(T*T)(Ω))
            = volume √(T*T)(Ω)
            = (det √(T*T))(volume Ω)
            = |det T|(volume Ω),


where the second equality holds because volume is not changed by the
isometry S (by 10.53), the third equality holds by 10.52 (applied to the positive
operator √(T*T)), and the fourth equality holds by 10.47. ∎
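
The result 10.54 can be checked directly in R² by applying an operator to the
unit square and measuring the resulting parallelogram. The sketch below is an
illustration only: the matrix is an arbitrary example, the function names are
made up for this note, and the area is computed with the shoelace formula,
which is a standard tool not developed in this book.

```python
def apply(T, v):
    """Apply the 2-by-2 matrix T to the vector v."""
    return (T[0][0] * v[0] + T[0][1] * v[1],
            T[1][0] * v[0] + T[1][1] * v[1])

def polygon_area(vertices):
    """Area of a simple polygon from its vertices in order (shoelace formula)."""
    twice_signed = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        twice_signed += x0 * y1 - x1 * y0
    return abs(twice_signed) / 2.0

T = [[2.0, 1.0],
     [1.0, -3.0]]                      # det T = -7
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]  # unit square, area 1
image = [apply(T, v) for v in square]  # vertices of the parallelogram T(square)

det_T = T[0][0] * T[1][1] - T[0][1] * T[1][0]
print(polygon_area(image), abs(det_T))  # prints 7.0 7.0
```

The determinant is negative here, yet the area is multiplied by its absolute
value, as 10.54 states.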
The result that we just proved leads to the appearance of determinants in
the formula for change of variables in multivariable integration. To describe 
this, we will again be vague and intuitive. 

Throughout this book, almost all the functions we have encountered have 
been linear. Thus please be aware that the functions f and σ in the material
below are not assumed to be linear. 

The next definition aims at conveying the idea of the integral; it is not 
intended as a rigorous definition. 


10.55 Definition integral, ∫_Ω f


If Ω ⊂ R^n and f is a real-valued function on Ω, then the integral of f
over Ω, denoted ∫_Ω f or ∫_Ω f(x) dx, is defined by breaking Ω into pieces
small enough that f is almost constant on each piece. On each piece,
multiply the (almost constant) value of f by the volume of the piece, then
add up these numbers for all the pieces, getting an approximation to the
integral that becomes more accurate as Ω is divided into finer pieces.


Actually, Ω in the definition above needs to be a reasonable set (for
example, open or measurable) and f needs to be a reasonable function (for
example, continuous or measurable), but we will not worry about those
technicalities. Also, notice that the x in ∫_Ω f(x) dx is a dummy variable and
could be replaced with any other symbol.

Now we define the notions of differentiable and derivative. Notice that 
in this context, the derivative is an operator, not a number as in one-variable 
calculus. The uniqueness of T in the definition below is left as Exercise 9. 


10.56 Definition differentiable, derivative, σ′(x)


Suppose Ω is an open subset of R^n and σ is a function from Ω to R^n.
For x ∈ Ω, the function σ is called differentiable at x if there exists an
operator T ∈ L(R^n) such that

\[ \lim_{y \to 0} \frac{\|\sigma(x + y) - \sigma(x) - Ty\|}{\|y\|} = 0. \]

If σ is differentiable at x, then the unique operator T ∈ L(R^n) satisfying
the equation above is called the derivative of σ at x and is denoted by
σ′(x).
The idea of the derivative is that for x fixed and ‖y‖ small,


σ(x + y) ≈ σ(x) + (σ′(x))(y);


because σ′(x) ∈ L(R^n), this makes sense.

If n = 1, then the derivative in the sense of the definition above is the
operator on R of multiplication by the derivative in the usual sense of
one-variable calculus.
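
To see the limit in 10.56 at work, the sketch below evaluates the quotient
‖σ(x + y) − σ(x) − Ty‖ / ‖y‖ for shrinking y. Everything specific here is an
assumption made for illustration: the map σ(x_1, x_2) = (x_1², x_1 x_2), the
candidate operator T, the base point, and the single direction along which y
shrinks. A numerical check along one direction suggests, but does not prove,
differentiability.

```python
import math

def sigma(x):
    """Illustrative nonlinear map from R^2 to R^2 (an arbitrary choice)."""
    return (x[0] * x[0], x[0] * x[1])

def candidate_derivative(x, y):
    """Apply the operator with matrix [[2 x1, 0], [x2, x1]] to the vector y."""
    return (2 * x[0] * y[0], x[1] * y[0] + x[0] * y[1])

def quotient(x, y):
    """The ratio ||sigma(x + y) - sigma(x) - T y|| / ||y|| from 10.56."""
    s_xy = sigma((x[0] + y[0], x[1] + y[1]))
    s_x = sigma(x)
    ty = candidate_derivative(x, y)
    diff = (s_xy[0] - s_x[0] - ty[0], s_xy[1] - s_x[1] - ty[1])
    return math.hypot(*diff) / math.hypot(*y)

x = (1.0, 2.0)
ratios = [quotient(x, (t, -t)) for t in (1e-1, 1e-2, 1e-3)]
print(ratios)  # shrinks roughly in proportion to t
```

For this particular map the remainder is (y_1², y_1 y_2), so the quotient
along y = (t, −t) equals t up to rounding.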

Suppose Ω is an open subset of R^n and σ is a function from Ω to R^n. We
can write


σ(x) = (σ_1(x), ..., σ_n(x)),


where each σ_j is a function from Ω to R. The partial derivative of σ_j
with respect to the k-th coordinate is denoted D_k σ_j. Evaluating this partial
derivative at a point x ∈ Ω gives D_k σ_j(x). If σ is differentiable at x, then the
matrix of σ′(x) with respect to the standard basis of R^n contains D_k σ_j(x) in
row j, column k (this is left as an exercise). In other words,


10.57
\[ \mathcal{M}\bigl(\sigma'(x)\bigr) =
\begin{pmatrix}
D_1\sigma_1(x) & \dots & D_n\sigma_1(x) \\
\vdots & & \vdots \\
D_1\sigma_n(x) & \dots & D_n\sigma_n(x)
\end{pmatrix}. \]
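
The matrix in 10.57 can be approximated numerically by difference quotients.
The sketch below is a minimal illustration under stated assumptions: the map
σ(x_1, x_2) = (x_1 x_2, x_1 + x_2²) and the base point are arbitrary choices,
`numerical_jacobian` is a name made up for this note, and central differences
are one of several reasonable approximation schemes.

```python
def sigma(x):
    """Illustrative map from R^2 to R^2 (an arbitrary choice)."""
    return (x[0] * x[1], x[0] + x[1] * x[1])

def numerical_jacobian(f, x, h=1e-6):
    """Central-difference approximation; entry [j][k] approximates D_k f_j(x)."""
    n = len(x)
    columns = []
    for k in range(n):
        forward = list(x)
        backward = list(x)
        forward[k] += h
        backward[k] -= h
        f_plus, f_minus = f(forward), f(backward)
        columns.append([(f_plus[j] - f_minus[j]) / (2 * h) for j in range(n)])
    # columns[k][j] holds D_k f_j, so transpose to put it in row j, column k.
    return [[columns[k][j] for k in range(n)] for j in range(n)]

x = (1.5, -0.5)
J = numerical_jacobian(sigma, x)
# Partial derivatives computed by hand for this sigma, arranged as in 10.57.
exact = [[x[1], x[0]],
         [1.0, 2 * x[1]]]
print(J)
print(exact)  # the two matrices agree to several decimal places
```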


Now we can state the change of variables integration formula. Some 
additional mild hypotheses are needed for f and σ′ (such as continuity or
measurability), but we will not worry about them because the proof below is 
really a pseudoproof that is intended to convey the reason the result is true. 

The result below is called a change of variables formula because you can 
think of y = σ(x) as a change of variables, as illustrated by the two examples
that follow the proof. 


10.58 Change of variables in an integral 


Suppose Ω is an open subset of R^n and σ : Ω → R^n is differentiable at
every point of Ω. If f is a real-valued function defined on σ(Ω), then

\[ \int_{\sigma(\Omega)} f(y)\,dy = \int_{\Omega} f\bigl(\sigma(x)\bigr)\,|\det \sigma'(x)|\,dx. \]


Proof Let x ∈ Ω and let Γ be a small subset of Ω containing x such that f
is approximately equal to the constant f(σ(x)) on the set σ(Γ).

Adding a fixed vector [such as σ(x)] to each vector in a set produces
another set with the same volume. Thus our approximation for σ near x using
the derivative shows that

volume σ(Γ) ≈ volume[(σ′(x))(Γ)].

Using 10.54 applied to the operator σ′(x), this becomes

volume σ(Γ) ≈ |det σ′(x)|(volume Γ).


Let y = σ(x). Multiply the left side of the equation above by f(y) and the
right side by f(σ(x)) [because y = σ(x), these two quantities are equal],
getting


f(y) volume σ(Γ) ≈ f(σ(x)) |det σ′(x)|(volume Γ).


Now break Ω into many small pieces and add the corresponding versions of
the equation above, getting the desired result. ∎


The key point when making a change of variables is that the factor of
|det σ′(x)| must be included when making a substitution y = σ(x), as in the
right side of 10.58. We finish up by illustrating this point with two important
examples.


10.59 Example polar coordinates
Define σ : R² → R² by


σ(r, θ) = (r cos θ, r sin θ),


where we have used r, θ as the coordinates instead of x_1, x_2 for reasons
that will be obvious to everyone familiar with polar coordinates (and will
be a mystery to everyone else). For this choice of σ, the matrix of partial
derivatives corresponding to 10.57 is

\[ \begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix}, \]

as you should verify. The determinant of the matrix above equals r, thus
explaining why a factor of r is needed when computing an integral in polar
coordinates.


For example, note the extra factor of r in the following familiar formula
involving integrating a function f over a disk in R²:

\[ \int_{-1}^{1} \int_{-\sqrt{1-x^2}}^{\sqrt{1-x^2}} f(x, y)\,dy\,dx
= \int_{0}^{2\pi} \int_{0}^{1} f(r\cos\theta, r\sin\theta)\,r\,dr\,d\theta. \]
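
The two sides of the formula above can be compared numerically. The sketch
below is a rough illustration, not a rigorous computation: the integrand
f(x, y) = x² + y² is an arbitrary choice (its integral over the unit disk is
π/2), both sides are approximated by simple midpoint sums, and the function
names are made up for this note.

```python
import math

def f(x, y):
    """Illustrative integrand (an arbitrary choice)."""
    return x * x + y * y

def cartesian_sum(m):
    """Midpoint sum over grid cells of [-1, 1]^2 whose centers lie in the unit disk."""
    h = 2.0 / m
    total = 0.0
    for i in range(m):
        x = -1.0 + (i + 0.5) * h
        for j in range(m):
            y = -1.0 + (j + 0.5) * h
            if x * x + y * y < 1.0:
                total += f(x, y) * h * h
    return total

def polar_sum(m):
    """Midpoint sum over (r, theta) in (0, 1) x (0, 2 pi), including the factor r."""
    dr = 1.0 / m
    dtheta = 2.0 * math.pi / m
    total = 0.0
    for i in range(m):
        r = (i + 0.5) * dr
        for j in range(m):
            theta = (j + 0.5) * dtheta
            total += f(r * math.cos(theta), r * math.sin(theta)) * r * dr * dtheta
    return total

print(cartesian_sum(400), polar_sum(400), math.pi / 2)
```

Dropping the factor `r` from `polar_sum` would give an approximation of 2π/3
instead, which no longer matches the Cartesian side.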
10.60 Example spherical coordinates
Define σ : R³ → R³ by


σ(ρ, φ, θ) = (ρ sin φ cos θ, ρ sin φ sin θ, ρ cos φ),


where we have used ρ, φ, θ as the coordinates instead of x_1, x_2, x_3 for reasons
that will be obvious to everyone familiar with spherical coordinates (and will
be a mystery to everyone else). For this choice of σ, the matrix of partial
derivatives corresponding to 10.57 is

\[ \begin{pmatrix}
\sin\varphi\cos\theta & \rho\cos\varphi\cos\theta & -\rho\sin\varphi\sin\theta \\
\sin\varphi\sin\theta & \rho\cos\varphi\sin\theta & \rho\sin\varphi\cos\theta \\
\cos\varphi & -\rho\sin\varphi & 0
\end{pmatrix}, \]

as you should verify. The determinant of the matrix above equals ρ² sin φ,
thus explaining why a factor of ρ² sin φ is needed when computing an integral
in spherical coordinates.
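
The determinant claimed above can be spot-checked numerically. The sketch
below evaluates the matrix of partial derivatives at one sample point and
compares its determinant with ρ² sin φ. The sample point is arbitrary and the
function names are made up for this note. Agreement at one point is a sanity
check, not a substitute for the verification requested in the example.

```python
import math

def spherical_jacobian(rho, phi, theta):
    """Matrix of partial derivatives of sigma(rho, phi, theta), arranged as in 10.57."""
    sp, cp = math.sin(phi), math.cos(phi)
    st, ct = math.sin(theta), math.cos(theta)
    return [[sp * ct, rho * cp * ct, -rho * sp * st],
            [sp * st, rho * cp * st, rho * sp * ct],
            [cp, -rho * sp, 0.0]]

def det3(m):
    """Determinant of a 3-by-3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

rho, phi, theta = 1.5, 0.8, 2.1  # arbitrary sample point
computed = det3(spherical_jacobian(rho, phi, theta))
expected = rho ** 2 * math.sin(phi)
print(computed, expected)  # the two values agree up to rounding
```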

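For example, note the extra factor of ρ² sin φ in the following familiar
formula involving integrating a function f over a ball in R³:

\[ \int_{-1}^{1} \int_{-\sqrt{1-x^2}}^{\sqrt{1-x^2}}
\int_{-\sqrt{1-x^2-y^2}}^{\sqrt{1-x^2-y^2}} f(x, y, z)\,dz\,dy\,dx \]
\[ = \int_{0}^{2\pi} \int_{0}^{\pi} \int_{0}^{1}
f(\rho\sin\varphi\cos\theta, \rho\sin\varphi\sin\theta, \rho\cos\varphi)\,
\rho^2 \sin\varphi\,d\rho\,d\varphi\,d\theta. \]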


EXERCISES 10.B 


1 Suppose V is a real vector space. Suppose T ∈ L(V) has no eigenvalues.
Prove that det T > 0. 


2 Suppose V is a real vector space with even dimension and T ∈ L(V).
Suppose det T < 0. Prove that T has at least two distinct eigenvalues. 


3 Suppose T ∈ L(V) and n = dim V > 2. Let λ_1, ..., λ_n denote the
eigenvalues of T (or of T_C if V is a real vector space), repeated according
to multiplicity.


(a) Find a formula for the coefficient of z^(n−2) in the characteristic
polynomial of T in terms of λ_1, ..., λ_n.


(b) Find a formula for the coefficient of z in the characteristic
polynomial of T in terms of λ_1, ..., λ_n.


4 Suppose T ∈ L(V) and c ∈ F. Prove that det(cT) = c^(dim V) det T.


5 Prove or give a counterexample: if S, T ∈ L(V), then det(S + T) =
det S + det T.


6 Suppose A is a block upper-triangular matrix

\[ A = \begin{pmatrix} A_1 & & * \\ & \ddots & \\ 0 & & A_m \end{pmatrix}, \]

where each A_j along the diagonal is a square matrix. Prove that


det A = (det A_1) ⋯ (det A_m).


7 Suppose A is an n-by-n matrix with real entries. Let S ∈ L(C^n) denote
the operator on C^n whose matrix equals A, and let T ∈ L(R^n) denote
the operator on R^n whose matrix equals A. Prove that trace S = trace T
and det S = det T.


8 Suppose V is an inner product space and T ∈ L(V). Prove that

\[ \det T^* = \overline{\det T}. \]

Use this to prove that |det T| = det √(T*T), giving a different proof than
was given in 10.47.


9 Suppose Ω is an open subset of R^n and σ is a function from Ω to R^n.
Suppose x ∈ Ω and σ is differentiable at x. Prove that the operator
T ∈ L(R^n) satisfying the equation in 10.56 is unique.

[This exercise shows that the notation σ′(x) is justified.]


10 Suppose T ∈ L(R^n) and x ∈ R^n. Prove that T is differentiable at x and
T′(x) = T.


11 Find a suitable hypothesis on σ and then prove 10.57.


12 Let a, b, c be positive numbers. Find the volume of the ellipsoid

\[ \left\{ (x, y, z) \in \mathbf{R}^3 :
\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} < 1 \right\} \]

by finding a set Ω ⊂ R³ whose volume you know and an operator
T ∈ L(R³) such that T(Ω) equals the ellipsoid above.